Abstract

We consider the problem of calculating distance correlation coefficients between
random vectors whose joint distributions belong to the class of Lancaster
distributions. We derive under mild convergence conditions a general series
representation for the distance covariance for these distributions. To illustrate the
general theory, we apply the series representation to derive explicit expressions
for the distance covariance and distance correlation coefficients for the bivariate
normal distribution and its generalizations of Lancaster type, the multivariate
normal distributions, and the bivariate gamma, Poisson, and negative binomial
distributions which are of Lancaster type.
1 Introduction

The concepts of distance covariance and distance correlation, introduced by Székely, et
al. [27, 31], have been shown to be widely applicable for measuring dependence between
collections of random variables. As examples of the ubiquity of distance correlation
methods, we note the results on distance correlation given recently by: Székely, et al.
[21, 28, 29, 30, 31], on statistical inference; Sejdinovic, et al. [26], on machine learning;
Kong, et al. [10], on familial relationships and mortality; Zhou [33], on nonlinear time
series; Lyons [17], on abstract metric spaces; Martínez-Gómez, et al. [18] and Richards,
et al. [20], on large astrophysical databases; Dueck, et al. [5], on high-dimensional
inference and the analysis of wind data; and Dueck, et al. [6], on a connection with
singular integrals on Euclidean spaces.

A result which is of fundamental importance in distance correlation theory is the
explicit formula for the empirical distance correlation coefficient [31, pp. 2773-2774].
By combining that explicit formula with the fast algorithm of Huo and Székely [9], it
becomes straightforward to apply distance correlation methods to real-world data sets.
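To make the empirical coefficient concrete, the following is a minimal Python sketch of the standard double-centering formula for univariate samples. It is a direct O(n^2) computation and not the fast algorithm of [9]; the function names (`empirical_dcov_sq`, `empirical_dcor`) are illustrative choices and do not come from the cited papers.

```python
import math


def _centered_distances(sample):
    """Double-centered matrix of pairwise distances |x_k - x_l|."""
    n = len(sample)
    d = [[abs(sample[k] - sample[l]) for l in range(n)] for k in range(n)]
    row_mean = [sum(row) / n for row in d]   # equals the column means (d is symmetric)
    grand_mean = sum(row_mean) / n
    return [[d[k][l] - row_mean[k] - row_mean[l] + grand_mean
             for l in range(n)] for k in range(n)]


def empirical_dcov_sq(x, y):
    """Squared empirical distance covariance (V-statistic form)."""
    a, b = _centered_distances(x), _centered_distances(y)
    n = len(x)
    return sum(a[k][l] * b[k][l] for k in range(n) for l in range(n)) / n ** 2


def empirical_dcor(x, y):
    """Empirical distance correlation; zero if either distance variance vanishes."""
    vxx, vyy = empirical_dcov_sq(x, x), empirical_dcov_sq(y, y)
    if vxx <= 0.0 or vyy <= 0.0:
        return 0.0
    vxy = max(empirical_dcov_sq(x, y), 0.0)
    return math.sqrt(vxy / math.sqrt(vxx * vyy))
```

For example, `empirical_dcor(x, [2 * v + 1 for v in x])` equals 1 up to rounding for any non-constant sample `x`, because an affine map rescales every centered distance by the same factor.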

On the other hand, the calculation of population distance correlation coefficients
remains an intractable problem generally. Székely, et al. [31, pp. 2785-2786] calculated
the distance correlation coefficient for the bivariate normal distribution; Dueck, et al.
[4, Appendix] extended that result to the general multivariate normal distribution;
and Dueck, et al. [5] calculated the affinely invariant distance correlation coefficient
for the multivariate normal distribution. Otherwise, no such results are yet available
for any other distribution. Hence, the state of distance correlation theory hitherto
is that the empirical coefficients can be calculated readily but the opposite holds for
their population counterparts, generally. Consequently, it was not possible to calculate
distance correlation coefficients explicitly for given nonnormal distributions in terms of
the usual parameters that parametrize these distributions, or to ascertain for nonnormal
distributions any analogs of the limit theorems derived by Dueck, et al. [5, Section 4].

We describe in detail the difficulties arising in attempts to calculate the population
distance correlation coefficients. Let $p$ and $q$ be positive integers. For column vectors
$s \in \mathbb{R}^p$ and $t \in \mathbb{R}^q$, denote by $\|s\|$ and $\|t\|$ the standard Euclidean norms on the
corresponding spaces; thus, if $s = (s_1, \ldots, s_p)'$ then $\|s\| = (s_1^2 + \cdots + s_p^2)^{1/2}$, and
similarly for $\|t\|$. Given vectors $u$ and $v$ of the same dimension, we let $\langle u, v \rangle$ be the
standard Euclidean scalar product of $u$ and $v$. For jointly distributed random vectors
$(X, Y) \in \mathbb{R}^p \times \mathbb{R}^q$ and non-random vectors $(s, t) \in \mathbb{R}^p \times \mathbb{R}^q$, let

\[ \psi_{X,Y}(s,t) = E \exp\big[ i \langle s, X \rangle + i \langle t, Y \rangle \big], \]

$i = \sqrt{-1}$, be the joint characteristic function of $(X, Y)$, and let $\psi_X(s) = \psi_{X,Y}(s, 0)$
and $\psi_Y(t) = \psi_{X,Y}(0, t)$ be the corresponding marginal characteristic functions. For any
$z \in \mathbb{C}$, let $|z|^2$ denote the squared modulus of $z$; also, we use the notation

\[ \gamma_p = \frac{\pi^{(p+1)/2}}{\Gamma((p+1)/2)}. \tag{1.1} \]

In the case of distributions with finite first moments, Székely, et al. [31, p. 2772] defined
$V(X, Y)$, the distance covariance between $X$ and $Y$, to be the positive square-root of

\[ V^2(X,Y) = \frac{1}{\gamma_p \gamma_q} \int_{\mathbb{R}^{p+q}} \frac{|\psi_{X,Y}(s,t) - \psi_X(s)\psi_Y(t)|^2}{\|s\|^{p+1} \, \|t\|^{q+1}} \, ds \, dt, \tag{1.2} \]

and they defined the distance correlation coefficient between $X$ and $Y$ as

\[ R(X,Y) = \frac{V(X,Y)}{\sqrt{V(X,X) \, V(Y,Y)}} \tag{1.3} \]

if both $V(X, X)$ and $V(Y, Y)$ are strictly positive, and otherwise to be zero [31, p. 2773].
For distributions with finite first moments we have $0 \le R(X, Y) \le 1$, and $R(X, Y) = 0$
if and only if $X$ and $Y$ are mutually independent.

For given random vectors $X$ and $Y$, the fundamental obstacle in calculating the
population distance correlation coefficient (1.3) is the computation of the singular
integral (1.2). In particular, the singular nature of the integrand precludes evaluation of
the integral by expanding the numerator, $|\psi_{X,Y}(s,t) - \psi_X(s)\psi_Y(t)|^2$, and subsequent
term-by-term integration of each of the resulting three terms.

In this paper, we calculate the distance correlation coefficients for pairs $(X, Y)$ of
random vectors whose joint distributions are in the class of Lancaster distributions, a
class of probability distributions made prominent by Lancaster [15, 16] and Sarmanov
[23]. The distribution functions of the Lancaster family are well-known to have
attractive expansions in terms of certain orthogonal functions (Koudou [14]; Diaconis, et
al. [3]). By applying those expansions, we obtain explicit expressions for the distance
covariance and distance correlation coefficients.

Consequently, we derive under mild convergence conditions a general formula for
the distance covariance for the Lancaster distributions. We apply the general formula
to obtain explicit expressions for the distance covariance and distance correlation for
the bivariate normal distributions and some of its generalizations, for the multivariate
normal distributions, and for bivariate gamma, Poisson, and negative binomial
distributions. We remark that explicit results can also be obtained for other Lancaster-type
expansions obtained by Bar-Lev, et al. [2]; however, we will omit the details for other
cases because the formulas derived here are entirely representative of other cases.


2 The Lancaster distributions 

To recapitulate the class of Lancaster distributions we generally follow the standard 
notation in that area, as given by Koudou [13, 14]; cf., Lancaster [16], Pommeret [19], 
or Diaconis, et al. [3, Section 6]. 

Let $(\mathcal{X}, \mu)$ and $(\mathcal{Y}, \nu)$ be locally compact, separable probability spaces, such that
$L^2(\mu)$ and $L^2(\nu)$ are separable. Let $\sigma$, a probability measure on $\mathcal{X} \times \mathcal{Y}$, have marginal
distributions $\mu$ and $\nu$; then there exist functions $K_\sigma$ and $L_\sigma$ such that

\[ \sigma(dx, dy) = K_\sigma(x, dy) \, \mu(dx) = L_\sigma(dx, y) \, \nu(dy). \]

We note that $K_\sigma$ and $L_\sigma$ represent the conditional distributions of $Y$ given $X = x$, and
$X$ given $Y = y$, respectively.

Let $C$ denote a countable index set with a zero element, denoted by 0. Let $\{P_n : n \in C\}$
and $\{Q_n : n \in C\}$ be sequences of functions on $\mathcal{X}$ and $\mathcal{Y}$ which form orthonormal
bases for the separable Hilbert spaces $L^2(\mu)$ and $L^2(\nu)$, respectively. We assume, by
convention, that $P_0 \equiv 1$ and $Q_0 \equiv 1$.

Since the tensor product Hilbert space $L^2(\mu \otimes \nu) = L^2(\mu) \otimes L^2(\nu)$ is separable there
holds, for $\sigma \in L^2(\mu \otimes \nu)$, the expansion

\[ \sigma(dx, dy) = \sum_{m \in C} \sum_{n \in C} \rho_{m,n} \, P_m(x) \, Q_n(y) \, \mu(dx) \, \nu(dy), \tag{2.1} \]

$(x, y) \in \mathcal{X} \times \mathcal{Y}$. Letting $\delta_{m,n}$ denote Kronecker's delta, the probability measure $\sigma$ is
called a Lancaster distribution if there exists a nonnegative sequence $\{\rho_n : n \in C\}$ such
that

\[ \int P_m(x) \, Q_n(y) \, \sigma(dx, dy) = \rho_m \, \delta_{m,n} \]

for all $m, n \in C$; in particular, $\rho_0 = 1$. The sequence $\{\rho_n : n \in C\}$ is called a Lancaster
sequence, and the expansion (2.1) reduces to

\[ \sigma(dx, dy) = \sum_{n \in C} \rho_n \, P_n(x) \, Q_n(y) \, \mu(dx) \, \nu(dy). \]


Koudou [13, pp. 255-256] characterized the Lancaster sequences $\{\rho_n : n \in C\}$ such that
the associated probability distribution $\sigma$ is absolutely continuous with respect to $\mu \otimes \nu$
and has Radon-Nikodym derivative

\[ \frac{\sigma(dx, dy)}{\mu(dx) \, \nu(dy)} = \sum_{n \in C} \rho_n \, P_n(x) \, Q_n(y) \in L^2(\mu \otimes \nu), \]

$(x, y) \in \mathcal{X} \times \mathcal{Y}$.

In the sequel, we consider the case in which $\mathcal{X} = \mathbb{R}^p$ and $\mathcal{Y} = \mathbb{R}^q$, and the underlying
random vectors $X \in \mathbb{R}^p$ and $Y \in \mathbb{R}^q$ have joint distribution $\sigma$ and marginal distributions
$\mu$ and $\nu$, respectively. We assume that $\mu$, $\nu$, and $\sigma$ are absolutely continuous with
respect to Lebesgue measure or counting measure on the respective sample spaces
and we denote their corresponding probability density functions by $\phi_X$, $\phi_Y$, and $\phi_{X,Y}$,
respectively. This yields the expansion,

\[ \phi_{X,Y}(x,y) = \phi_X(x) \, \phi_Y(y) \sum_{n \in C} \rho_n \, P_n(x) \, Q_n(y). \tag{2.2} \]

We will refer to (2.2) as the Lancaster expansion of the joint density function $\phi_{X,Y}$.
3 Examples of Lancaster expansions

In this section, we provide examples of Lancaster expansions (2.2) for the bivariate
normal distribution and some of its generalizations, the multivariate normal distributions,
and the bivariate gamma, Poisson, and negative binomial distributions. In the sequel,
we denote by $\mathbb{N}_0$ the set of nonnegative integers.

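3.1 The bivariate normal distribution and some of its generalizations

Let $(X, Y)$ follow a bivariate normal distribution with mean vector 0 and covariance
matrix

\[ \Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}, \]

denoted by $(X, Y) \sim N_2(0, \Sigma)$. The joint probability density function of $(X, Y)$ is

\[ \phi_{X,Y}(x,y) = \frac{1}{2\pi (1-\rho^2)^{1/2}} \exp\Big( -\frac{x^2 - 2\rho x y + y^2}{2(1-\rho^2)} \Big), \]

$x, y \in \mathbb{R}$, and the marginal density functions are given by

\[ \phi_X(x) = (2\pi)^{-1/2} \exp(-\tfrac12 x^2), \qquad \phi_Y(y) = (2\pi)^{-1/2} \exp(-\tfrac12 y^2). \]

In this case, the index set $C$ is $\mathbb{N}_0$. For $n \in \mathbb{N}_0$, let

\[ H_n(x) = (-1)^n \exp(\tfrac12 x^2) \Big( \frac{d}{dx} \Big)^n \exp(-\tfrac12 x^2), \]

$x \in \mathbb{R}$, denote the $n$th Hermite polynomial, $n = 0, 1, 2, \ldots$. It is well-known that
the polynomials $\{H_n : n \in \mathbb{N}_0\}$ are orthogonal with respect to the standard normal
distribution and form a complete orthogonal basis for the corresponding Hilbert space $L^2$. Also,
the Lancaster expansion of $\phi_{X,Y}$ is given by the classical formula of Mehler, which states
that, for $x, y \in \mathbb{R}$,

\[ \phi_{X,Y}(x,y) = \phi_X(x) \, \phi_Y(y) \sum_{n=0}^\infty \frac{\rho^n}{n!} \, H_n(x) \, H_n(y), \tag{3.1} \]

and this series converges absolutely for all $x, y \in \mathbb{R}$.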
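Mehler's formula can be checked numerically. The sketch below assumes the probabilists' Hermite polynomials, generated by the recurrence $H_{n+1}(x) = x H_n(x) - n H_{n-1}(x)$, standard normal marginals, and the series coefficients $\rho^n / n!$; the function names and the truncation level `terms` are illustrative choices.

```python
import math


def hermite_values(n_max, x):
    """Return [H_0(x), ..., H_{n_max}(x)] for the probabilists' Hermite polynomials."""
    vals = [1.0, x]
    for n in range(1, n_max):
        vals.append(x * vals[n] - n * vals[n - 1])
    return vals[: n_max + 1]


def mehler_density(x, y, rho, terms=80):
    """Truncated Mehler series for the standard bivariate normal density."""
    hx, hy = hermite_values(terms, x), hermite_values(terms, y)
    total, coef = 0.0, 1.0            # coef holds rho**n / n!
    for n in range(terms + 1):
        total += coef * hx[n] * hy[n]
        coef *= rho / (n + 1)
    phi = lambda u: math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return phi(x) * phi(y) * total


def bivariate_normal_density(x, y, rho):
    """Closed-form density with unit variances and correlation rho."""
    det = 1.0 - rho * rho
    quad = (x * x - 2.0 * rho * x * y + y * y) / (2.0 * det)
    return math.exp(-quad) / (2.0 * math.pi * math.sqrt(det))
```

For moderate correlations, such as `rho = 0.5`, the truncated series and the closed form agree to many digits; for `|rho|` close to 1 the series converges slowly and more terms would be needed.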

We remark that there are numerous extensions of Mehler's formula which represent
Lancaster-type expansions for generalizations of the bivariate normal distribution.
Sarmanov and Bratoeva [25] consider series expansions of the form

\[ \phi_{X,Y}(x,y) = \phi_X(x) \, \phi_Y(y) \sum_{n=0}^\infty \frac{\rho_n}{n!} \, H_n(x) \, H_n(y), \tag{3.2} \]

$x, y \in \mathbb{R}$, where the sequence of real numbers $\{\rho_n : n = 0, 1, 2, \ldots\}$ satisfies $\sum_{n=0}^\infty \rho_n^2 < \infty$.
Sarmanov and Bratoeva proved that for the expansion (3.2) to be nonnegative, and
therefore to be a valid probability density function, it is necessary and sufficient that
the sequence $\{\rho_n\}$ be the moment sequence of a random variable $\xi$ supported on the
interval $[-1, 1]$.

An example of this generalization is the case in which $\phi_{X,Y}$ is a mixture of bivariate
normal densities:

\[ \phi_{X,Y}(x,y) = \tfrac12 \big[ \phi_{X,Y}(x, y; \rho) + \phi_{X,Y}(x, y; -\rho) \big]
 = \phi_X(x) \, \phi_Y(y) \sum_{n=0}^\infty \frac{\rho^{2n}}{(2n)!} \, H_{2n}(x) \, H_{2n}(y); \tag{3.3} \]

this corresponds to the case in (3.2) in which

\[ \rho_n = \begin{cases} \rho^n, & n \text{ even}, \\ 0, & n \text{ odd}. \end{cases} \]

This mixture density also provides an example of a distribution for which the Pearson
correlation coefficient is zero whereas the distance correlation is positive.


3.2 The multivariate normal distribution 


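Let $X \in \mathbb{R}^p$ and $Y \in \mathbb{R}^q$ be random vectors such that $(X, Y) \sim N_{p+q}(0, \Sigma)$, a $(p + q)$-dimensional
multivariate normal distribution with mean vector 0 and positive definite
covariance matrix

\[ \Sigma = \begin{pmatrix} \Sigma_X & \Sigma_{XY} \\ \Sigma_{YX} & \Sigma_Y \end{pmatrix}, \tag{3.4} \]

where $\Sigma_X$, $\Sigma_Y$, and $\Sigma_{XY} = \Sigma_{YX}'$ are $p \times p$, $q \times q$ and $p \times q$ matrices, respectively. We
denote by $\phi_{X,Y}$ the joint probability density function of $(X, Y)$, and by $\phi_X$ and $\phi_Y$ the
marginal density functions of $X$ and $Y$, respectively.

We now describe the Lancaster expansion of $\phi_{X,Y}$, a result derived in [32]. In this
case, the index set $C$ is $\mathbb{N}_0^{p \times q}$, the set of $p \times q$ matrices with nonnegative integer entries.

For a matrix of summation indices $N = (N_{rc}) \in \mathbb{N}_0^{p \times q}$, define $N! = \prod_{r=1}^p \prod_{c=1}^q N_{rc}!$.
For $r = 1, \ldots, p$, let

\[ N_{r\cdot} = \sum_{c=1}^q N_{rc} \]

and set $N_{*\cdot} = (N_{1\cdot}, \ldots, N_{p\cdot})$. Similarly, for each $c = 1, \ldots, q$, define

\[ N_{\cdot c} = \sum_{r=1}^p N_{rc} \]

and set $N_{\cdot *} = (N_{\cdot 1}, \ldots, N_{\cdot q})$. Further, we define

\[ N_{\cdot\cdot} = \sum_{r=1}^p \sum_{c=1}^q N_{rc} \]

and note that $N_{\cdot\cdot} = \sum_{r=1}^p N_{r\cdot} = \sum_{c=1}^q N_{\cdot c}$.

Denoting by $(\Sigma_{XY})_{rc}$ the $(r, c)$th entry of $\Sigma_{XY}$, we also define

\[ \Sigma_{XY}^N = \prod_{r=1}^p \prod_{c=1}^q \big[ (\Sigma_{XY})_{rc} \big]^{N_{rc}}. \]

We now introduce the multivariate Hermite polynomials. For any $p \in \mathbb{N}$, $k = (k_1, \ldots, k_p) \in \mathbb{N}_0^p$,
and $x = (x_1, \ldots, x_p) \in \mathbb{R}^p$, define $x^k = x_1^{k_1} \cdots x_p^{k_p}$ and define the
differential operator,

\[ \Big( \frac{\partial}{\partial x} \Big)^k = \Big( \frac{\partial}{\partial x_1} \Big)^{k_1} \cdots \Big( \frac{\partial}{\partial x_p} \Big)^{k_p}. \]

The $k$th multivariate Hermite polynomial with respect to the marginal density function
$\phi_X$ is defined as

\[ H_k(x; \Sigma_X) = \frac{(-1)^{k_1 + \cdots + k_p}}{\phi_X(x)} \Big( \frac{\partial}{\partial x} \Big)^k \phi_X(x). \tag{3.5} \]

The Lancaster expansion of the multivariate normal density function $\phi_{X,Y}$ is given by
the generalized Mehler formula [32]:

\[ \phi_{X,Y}(x,y) = \phi_X(x) \, \phi_Y(y) \sum_{N} \frac{\Sigma_{XY}^N}{N!} \, H_{N_{*\cdot}}(x; \Sigma_X) \, H_{N_{\cdot *}}(y; \Sigma_Y), \tag{3.6} \]

with absolute convergence for all $x \in \mathbb{R}^p$, $y \in \mathbb{R}^q$.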

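To calculate the affinely invariant distance correlation coefficient between $X$ and $Y$,
as defined by Dueck, et al. [5], we need the Lancaster expansion of the joint density
function of the standardized random vectors $\widetilde{X} = \Sigma_X^{-1/2} X$ and $\widetilde{Y} = \Sigma_Y^{-1/2} Y$. It is
straightforward to verify that $(\widetilde{X}, \widetilde{Y}) \sim N_{p+q}(0, \Lambda)$ where

\[ \Lambda = \begin{pmatrix} I_p & \Lambda_{XY} \\ \Lambda_{XY}' & I_q \end{pmatrix}, \tag{3.7} \]

with $\Lambda_{XY} = \Sigma_X^{-1/2} \Sigma_{XY} \Sigma_Y^{-1/2}$, and then we deduce from (3.6) that the Lancaster
expansion for $(\widetilde{X}, \widetilde{Y})$ is

\[ \phi_{\widetilde{X},\widetilde{Y}}(x,y) = \phi_{\widetilde{X}}(x) \, \phi_{\widetilde{Y}}(y) \sum_{N} \frac{\Lambda_{XY}^N}{N!} \, H_{N_{*\cdot}}(x; I_p) \, H_{N_{\cdot *}}(y; I_q). \tag{3.8} \]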
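3.3 The bivariate gamma distribution

The Lancaster expansion for a bivariate gamma distribution, which was derived by
Sarmanov [24, 22], can be stated as follows (see Kotz, et al. [11, pp. 437-438]).

For $\alpha > -1$ and $n \in \mathbb{N}_0$, the classical Laguerre polynomial is defined by

\[ L_n^{(\alpha)}(x) = \frac{1}{n!} \, x^{-\alpha} \exp(x) \Big( \frac{d}{dx} \Big)^n \big[ x^{n+\alpha} \exp(-x) \big]
 = \frac{(\alpha+1)_n}{n!} \sum_{j=0}^n \frac{(-n)_j}{(\alpha+1)_j} \frac{x^j}{j!}, \tag{3.9} \]

$x > 0$, where

\[ (\alpha)_n = \frac{\Gamma(\alpha + n)}{\Gamma(\alpha)} = \alpha(\alpha + 1) \cdots (\alpha + n - 1), \]

$n = 0, 1, 2, \ldots$, denotes the rising factorial. By standardizing the classical Laguerre
polynomial, we obtain the orthonormal version [8],

\[ \widetilde{L}_n^{(\alpha)}(x) = \Big( \frac{(\alpha+1)_n}{n!} \Big)^{-1/2} L_n^{(\alpha)}(x)
 = \Big( \frac{(\alpha+1)_n}{n!} \Big)^{1/2} \sum_{j=0}^n \frac{(-n)_j}{(\alpha+1)_j} \frac{x^j}{j!}. \]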

Let $\lambda \in (0, 1)$, and let $\alpha$ and $\beta$ satisfy $\alpha \ge \beta > 0$. Sarmanov [24, 22] derived for
certain bivariate gamma random variables $(X, Y)$ the joint probability density function,

\[ \phi_{X,Y}(x,y) = \phi_X(x) \, \phi_Y(y) \sum_{n=0}^\infty a_n \, \widetilde{L}_n^{(\alpha-1)}(x) \, \widetilde{L}_n^{(\beta-1)}(y), \tag{3.10} \]

$x, y > 0$, where

\[ a_n = \Big[ \frac{(\beta)_n}{(\alpha)_n} \Big]^{1/2} \lambda^n, \tag{3.11} \]

$n = 0, 1, 2, \ldots$. The corresponding marginal density functions are

\[ \phi_X(x) = \frac{1}{\Gamma(\alpha)} \, x^{\alpha-1} \exp(-x) \]

and

\[ \phi_Y(y) = \frac{1}{\Gamma(\beta)} \, y^{\beta-1} \exp(-y), \]

which we recognize as the density functions of one-dimensional gamma random variables
with index parameters $\alpha$ and $\beta$, respectively.

We remark that if $\alpha = \beta$ then the density function (3.10) reduces to the Kibble-Moran
bivariate gamma density function, $\mathrm{Corr}(X, Y) = \lambda$ [11, pp. 436-437], and
(3.10) represents the Lancaster expansion for $(X, Y)$. On the other hand, if $\alpha \ne \beta$ then
$\mathrm{Corr}(X, Y) \ne \lambda$.
More generally, Griffiths [8] showed that a series expansion of the form

\[ \phi_{X,Y}(x,y) = \phi_X(x) \, \phi_Y(y) \sum_{n=0}^\infty \rho_n \, \widetilde{L}_n^{(\alpha-1)}(x) \, \widetilde{L}_n^{(\beta-1)}(y) \tag{3.12} \]

represents a valid bivariate probability density if and only if

\[ \rho_n = \Big[ \frac{(\beta)_n}{(\alpha)_n} \Big]^{1/2} \lambda_n, \tag{3.13} \]

where $\{\lambda_n\}$ is the moment sequence of a random variable $\xi$ concentrated on $[0, 1]$.


3.4 The bivariate Poisson distribution 


For $a > 0$ and $x, n \in \mathbb{N}_0$, let

\[ C_n(x; a) = \Big( \frac{a^n}{n!} \Big)^{1/2} \sum_{k=0}^n (-1)^{n-k} \binom{n}{k} \binom{x}{k} \, k! \, a^{-k} \tag{3.14} \]

denote the Poisson-Charlier polynomial of degree $n$. For $\lambda \in [0, 1]$, Koudou [14, Section
5] (cf., Bar-Lev, et al. [2], Pommeret [19]) showed that there exists a bivariate random
vector $(X, Y)$ with probability density function

\[ \phi_{X,Y}(x,y) = \phi_X(x) \, \phi_Y(y) \sum_{n=0}^\infty \lambda^n \, C_n(x; a) \, C_n(y; a), \tag{3.15} \]

$x, y \in \mathbb{N}_0$. The corresponding marginal density functions $\phi_X$ and $\phi_Y$ are given by

\[ \phi_X(k) = \phi_Y(k) = \frac{\exp(-a) \, a^k}{k!}, \]

$k \in \mathbb{N}_0$, so that $X$ and $Y$ are distributed marginally according to a Poisson distribution
with parameter $a$. The series (3.15) is an expansion of Lancaster type, a special case
of (2.2), and the resulting distribution is called a bivariate Poisson distribution.
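The bivariate Poisson expansion can be explored numerically. The sketch below assumes that the orthonormal Poisson-Charlier polynomials are normalized as $C_n(x; a) = (a^n/n!)^{1/2} \sum_{k=0}^n (-1)^{n-k} \binom{n}{k} \binom{x}{k} k! \, a^{-k}$; this normalization, the truncation level `terms`, and the function names are assumptions made for illustration. With this convention, the first few values of the series agree with the probabilities of the common-shock construction $X = Z_0 + Z_1$, $Y = Z_0 + Z_2$, with independent Poisson components of means $\lambda a$, $(1-\lambda) a$, $(1-\lambda) a$, which serves as a sanity check.

```python
import math


def charlier(n, x, a):
    """Orthonormal Poisson-Charlier polynomial C_n(x; a) (assumed normalization)."""
    s = sum((-1) ** (n - k) * math.comb(n, k) * math.comb(x, k)
            * math.factorial(k) * a ** (-k)
            for k in range(min(n, x) + 1))   # binom(x, k) vanishes for k > x
    return math.sqrt(a ** n / math.factorial(n)) * s


def poisson_pmf(k, a):
    """Poisson probability mass function with mean a."""
    return math.exp(-a) * a ** k / math.factorial(k)


def lancaster_poisson_pmf(x, y, lam, a, terms=60):
    """Truncated Lancaster series for the bivariate Poisson probability at (x, y)."""
    series = sum(lam ** n * charlier(n, x, a) * charlier(n, y, a)
                 for n in range(terms + 1))
    return poisson_pmf(x, a) * poisson_pmf(y, a) * series
```

For instance, with `a = 1.5` and `lam = 0.4`, `lancaster_poisson_pmf(0, 0, lam, a)` agrees with `math.exp(-(2 - lam) * a)` up to rounding.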


3.5 The bivariate negative binomial distribution 


The orthonormal polynomials for the classical univariate negative binomial distribution
are the (normalized) Meixner polynomials, given by

\[ M_n(x; \beta, c) = \Big( \frac{(\beta)_n \, c^n}{n!} \Big)^{1/2} \sum_{k=0}^n \frac{(-n)_k \, (-x)_k}{(\beta)_k \, k!} \Big( 1 - \frac{1}{c} \Big)^k, \tag{3.16} \]

for $\beta > 0$, $0 < c < 1$, and $x \in \mathbb{N}_0$. Koudou [14, Section 6] showed, by an approach
similar to that used for the bivariate Poisson distribution, that there exists a bivariate
random variable $(X, Y)$, with identical marginal negative binomial densities,

\[ \phi_X(x) = \phi_Y(x) = (1 - c)^\beta \, \frac{(\beta)_x}{x!} \, c^x, \]

$x \in \mathbb{N}_0$, and with joint probability density function,

\[ \phi_{X,Y}(x,y) = \phi_X(x) \, \phi_Y(y) \sum_{n=0}^\infty \lambda^n \, M_n(x; \beta, c) \, M_n(y; \beta, c), \tag{3.17} \]

where $x, y \in \mathbb{N}_0$, and $0 < \lambda < 1$. The expansion (3.17) represents a Lancaster expansion
of the joint density function.


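4 Distance correlation coefficients for Lancaster distributions

In this section, we derive a general series expression for the distance correlation
coefficients for Lancaster distributions with density functions of the form (2.2). For a joint
density function given by (2.2) and $n \in C$, we introduce the notation

\[ \mathcal{P}_n(s) = E \exp(i \langle s, X \rangle) \, P_n(X), \tag{4.1} \]

$s \in \mathbb{R}^p$, and

\[ \mathcal{Q}_n(t) = E \exp(i \langle t, Y \rangle) \, Q_n(Y), \tag{4.2} \]

$t \in \mathbb{R}^q$. To verify that the expectation in (4.1) converges absolutely for all $s \in \mathbb{R}^p$, we
apply the Cauchy-Schwarz inequality to obtain

\[ E |\exp(i \langle s, X \rangle) \, P_n(X)| \le \big( E |\exp(i \langle s, X \rangle)|^2 \big)^{1/2} \big( E |P_n(X)|^2 \big)^{1/2} = 1, \]

because $\{P_n : n \in C\}$ is an orthonormal basis for the Hilbert space $L^2(\mu)$. In particular,

\[ |\mathcal{P}_n(s)| \le E |\exp(i \langle s, X \rangle) \, P_n(X)| \le 1, \]

for all $s \in \mathbb{R}^p$ and, similarly, $|\mathcal{Q}_n(t)| \le 1$ for all $t \in \mathbb{R}^q$.

In the following result, we will use the notation

\[ \mathcal{A}_{j,k} = \int_{\mathbb{R}^p} \mathcal{P}_j(s) \, \mathcal{P}_k(-s) \, \frac{ds}{\|s\|^{p+1}} \]

and

\[ \mathcal{B}_{j,k} = \int_{\mathbb{R}^q} \mathcal{Q}_j(t) \, \mathcal{Q}_k(-t) \, \frac{dt}{\|t\|^{q+1}}, \]

$j, k \in C$, whenever these integrals converge absolutely.
We now state the main result.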


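Theorem 4.1. Suppose that the random vectors $X \in \mathbb{R}^p$ and $Y \in \mathbb{R}^q$ have the joint
probability density function (2.2). Then,

\[ V^2(X,Y) = \frac{1}{\gamma_p \gamma_q} \sum_{j \in C, \, j \ne 0} \; \sum_{k \in C, \, k \ne 0} \rho_j \, \rho_k \, \mathcal{A}_{j,k} \, \mathcal{B}_{j,k} \tag{4.3} \]

whenever the sum converges absolutely.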
Proof. Rewriting the Lancaster expansion (2.2) in the form,

\[ \phi_{X,Y}(x,y) - \phi_X(x) \, \phi_Y(y) = \phi_X(x) \, \phi_Y(y) \sum_{n \in C, \, n \ne 0} \rho_n \, P_n(x) \, Q_n(y), \]

and taking Fourier transforms on both sides of this identity, we obtain for all $s \in \mathbb{R}^p$
and $t \in \mathbb{R}^q$ the expansion

\[ \psi_{X,Y}(s,t) - \psi_X(s) \, \psi_Y(t) = \sum_{n \in C, \, n \ne 0} \rho_n \, \mathcal{P}_n(s) \, \mathcal{Q}_n(t). \tag{4.4} \]

This identity is valid subject to the requirement that we may interchange summation
and integration, which is justified by the assumption that the sum in the final result
converges absolutely. Using (4.4) we deduce that

\[ |\psi_{X,Y}(s,t) - \psi_X(s)\psi_Y(t)|^2 = \big( \psi_{X,Y}(s,t) - \psi_X(s)\psi_Y(t) \big) \, \overline{\big( \psi_{X,Y}(s,t) - \psi_X(s)\psi_Y(t) \big)}
 = \sum_{j \in C, \, j \ne 0} \; \sum_{k \in C, \, k \ne 0} \rho_j \, \rho_k \, \mathcal{P}_j(s) \, \mathcal{P}_k(-s) \, \mathcal{Q}_j(t) \, \mathcal{Q}_k(-t). \]

Next, we integrate this expansion with respect to the measures $ds / \|s\|^{p+1}$ and $dt / \|t\|^{q+1}$;
this requires that we again interchange summation and integration which, by
assumption, we are able to do. On carrying through these procedures, we obtain (4.3). □


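Remark 4.2. We note that Theorem 4.1 can be extended to the more general $\alpha$-distance
covariance and correlation measures treated by Székely, et al. [31, p. 2784].
For $\alpha \in (0, 2)$, define

\[ \gamma_{p,\alpha} = \frac{2 \pi^{p/2} \, \Gamma(1 - \tfrac12 \alpha)}{\alpha \, 2^\alpha \, \Gamma(\tfrac12 (p + \alpha))} \]

and let

\[ V_\alpha^2(X,Y) = \frac{1}{\gamma_{p,\alpha} \gamma_{q,\alpha}} \int_{\mathbb{R}^{p+q}} \frac{|\psi_{X,Y}(s,t) - \psi_X(s)\psi_Y(t)|^2}{\|s\|^{p+\alpha} \, \|t\|^{q+\alpha}} \, ds \, dt \]

be the $\alpha$-distance covariance between the random vectors $X$ and $Y$. Further, define

\[ \mathcal{A}_{j,k}(\alpha) = \int_{\mathbb{R}^p} \mathcal{P}_j(s) \, \mathcal{P}_k(-s) \, \frac{ds}{\|s\|^{p+\alpha}} \]

and

\[ \mathcal{B}_{j,k}(\alpha) = \int_{\mathbb{R}^q} \mathcal{Q}_j(t) \, \mathcal{Q}_k(-t) \, \frac{dt}{\|t\|^{q+\alpha}}, \]

$j, k \in C$, whenever these integrals converge absolutely. Then, the extension of Theorem
4.1 to the $\alpha$-distance covariance measures is that

\[ V_\alpha^2(X,Y) = \frac{1}{\gamma_{p,\alpha} \gamma_{q,\alpha}} \sum_{j \in C, \, j \ne 0} \; \sum_{k \in C, \, k \ne 0} \rho_j \, \rho_k \, \mathcal{A}_{j,k}(\alpha) \, \mathcal{B}_{j,k}(\alpha), \]

whenever the sum converges absolutely. The proof of this result is similar to the proof
of Theorem 4.1.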
5 Examples

In this section, we establish the versatility of Theorem 4.1 by applying it to compute
the distance correlation coefficients for the bivariate normal, multivariate normal, and
bivariate gamma, Poisson, and negative binomial distributions. We verify for each
example the absolute convergence of the series resulting from Theorem 4.1, because
absolute convergence is a hypothesis of the theorem and cannot be deduced from it in general. In
developing each example, we retain the corresponding notation in Section 3. We also
remark that the crucial singular integral in the theory of distance correlation [6, 31] is
evaluated in terms of the gamma function; however, even slight generalizations of that
integral can be evaluated only in terms of the Gaussian or the confluent hypergeometric
series; this explains the appearance of those series in the ensuing examples.

5.1 The bivariate normal distribution and some of its generalizations

In the sequel, we use the standard double-factorial notation,

\[ n!! = \begin{cases} 1, & \text{if } n = -1, 0, \\ n(n-2)(n-4) \cdots 1, & \text{if } n = 1, 3, 5, 7, \ldots, \\ n(n-2)(n-4) \cdots 2, & \text{if } n = 2, 4, 6, 8, \ldots \end{cases} \]

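Proposition 5.1. Let $(X, Y) \sim N_2(0, \Sigma)$, a bivariate normal distribution with
correlation coefficient $\rho$. Then,

\[ V^2(X,Y) = \frac{4}{\pi} \sum_{\ell=1}^\infty \frac{2^{2\ell} - 2}{(2\ell)!} \, \big( (2\ell - 3)!! \big)^2 \, \Big( \frac{\rho}{2} \Big)^{2\ell}, \tag{5.1} \]

and this series converges absolutely for all $\rho \in (-1, 1)$.

Proof. Starting with the Lancaster expansion of the bivariate normal density
function, as given in (3.1), and using the definitions of $\mathcal{P}_n$ and $\mathcal{Q}_n$ in (4.1) and (4.2),
respectively, we obtain by substitution and integration-by-parts,

\[ \mathcal{P}_n(s) = \int_{\mathbb{R}} \exp(i s x) \, H_n(x) \, \phi_X(x) \, dx = (i s)^n \exp(-\tfrac12 s^2), \]

$s \in \mathbb{R}$. Therefore,

\[ \mathcal{A}_{j,k} = (-1)^k \, i^{j+k} \int_{\mathbb{R}} s^{j+k-2} \exp(-s^2) \, ds
 = \begin{cases} (-1)^k \, i^{j+k} \, \pi^{1/2} \, 2^{-(j+k-2)/2} \, (j+k-3)!!, & \text{if } j + k \text{ is even}, \\ 0, & \text{otherwise}, \end{cases} \]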
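since the latter integral is a moment of the $N(0, \tfrac12)$ distribution. By Theorem 4.1, we
obtain

\[ V^2(X,Y) = \frac{4}{\pi} \sum_{\substack{j, k \ge 1 \\ j + k \text{ even}}} \frac{\rho^{j+k}}{j! \, k!} \, 2^{-(j+k)} \, \big( (j+k-3)!! \big)^2. \]

Setting $j + k = 2\ell$ with $\ell \ge 1$, the double series reduces to

\[ V^2(X,Y) = \frac{4}{\pi} \sum_{\ell=1}^\infty \Big( \frac{\rho}{2} \Big)^{2\ell} \big( (2\ell-3)!! \big)^2 \sum_{\substack{j, k \ge 1 \\ j + k = 2\ell}} \frac{1}{j! \, k!}
 = \frac{4}{\pi} \sum_{\ell=1}^\infty \Big( \frac{\rho}{2} \Big)^{2\ell} \big( (2\ell-3)!! \big)^2 \, \frac{2^{2\ell} - 2}{(2\ell)!}, \]

which is the same as (5.1).

The absolute convergence of (5.1) can be verified by comparison with a geometric
series. Moreover, it can be shown that the series reduces to the explicit formula,

\[ V^2(X,Y) = \frac{4}{\pi} \Big[ \rho \sin^{-1} \rho + \sqrt{1 - \rho^2} - \rho \sin^{-1}(\rho/2) - \sqrt{4 - \rho^2} + 1 \Big], \tag{5.2} \]

which is identical with the result obtained by Székely, et al. [31, pp. 2785-2786]. □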
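The agreement between the series and the closed form of Székely, et al. can be checked numerically. The sketch below assumes the series in the equivalent form $(4/\pi) \sum_{\ell \ge 1} c_\ell \, (1 - 2^{1-2\ell}) \, \rho^{2\ell}$ with $c_\ell = ((2\ell-3)!!)^2 / (2\ell)!$, and it updates $c_\ell$ through the ratio $c_{\ell+1}/c_\ell = (2\ell-1)^2 / ((2\ell+1)(2\ell+2))$ to avoid large factorials. The function names and the truncation level are illustrative.

```python
import math


def dcov_sq_series(rho, terms=400):
    """Truncated series for the squared distance covariance of a bivariate normal."""
    c = 0.5                      # c_1 = ((-1)!!)^2 / 2! = 1/2
    total = 0.0
    for l in range(1, terms + 1):
        total += c * (1.0 - 2.0 ** (1 - 2 * l)) * rho ** (2 * l)
        c *= (2 * l - 1) ** 2 / ((2 * l + 1) * (2 * l + 2))
    return 4.0 / math.pi * total


def dcov_sq_closed(rho):
    """Closed-form squared distance covariance for correlation rho in [-1, 1]."""
    return 4.0 / math.pi * (rho * math.asin(rho) + math.sqrt(1.0 - rho * rho)
                            - rho * math.asin(rho / 2.0)
                            - math.sqrt(4.0 - rho * rho) + 1.0)


def dcor_bivariate_normal(rho):
    """Distance correlation: V(X,Y) divided by the common distance variance V(X,X)."""
    return math.sqrt(dcov_sq_closed(rho) / dcov_sq_closed(1.0))
```

Evaluating `dcov_sq_closed(1.0)` gives the distance variance of a standard normal variable, $4/3 - 4(\sqrt{3} - 1)/\pi$, and `dcor_bivariate_normal(rho)` is strictly smaller than `abs(rho)` for `0 < abs(rho) < 1`.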

Having obtained $V(X, Y)$, we let $\rho \to 1-$ to obtain the distance variances $V(X, X)$
and $V(Y, Y)$; here, we are applying a well-known result that if $(X, Y) \sim N_2(0, \Sigma)$ where
$\mathrm{Var}(X) = \mathrm{Var}(Y)$ and $\rho = 1$ then $X = Y$, almost surely. By applying properties of
Gauss' hypergeometric series, as was done by Dueck, et al. [5, p. 2318], we obtain

\[ V^2(X,X) = V^2(Y,Y) = \frac{4}{3} - \frac{4(\sqrt{3} - 1)}{\pi}. \]

It is straightforward to extend the above results to generalizations of the type given
in equation (3.2).

Corollary 5.2. Let $(X, Y)$ be a bivariate random variable distributed according to a
density function as given in (3.2). Then

\[ V^2(X,Y) = \frac{4}{\pi} \sum_{\substack{j, k \ge 1 \\ j + k \text{ even}}} \frac{\rho_j \, \rho_k}{j! \, k!} \, 2^{-(j+k)} \, \big( (j+k-3)!! \big)^2. \tag{5.3} \]

For the example given in (3.3), the series expansion in (5.3) reduces to an explicit
formula similar to (5.2).
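Corollary 5.3. Let $(X, Y)$ be a bivariate random variable distributed according to a
density function as given in (3.2) with $\rho_n = \rho^n$ for $n$ even and $\rho_n = 0$ for $n$ odd. Then

\[ V^2(X,Y) = \frac{4}{\pi} \Big[ \tfrac12 \rho \sin^{-1} \rho + \tfrac12 \sqrt{1 - \rho^2} - \rho \sin^{-1}(\rho/2) - \sqrt{4 - \rho^2} + \tfrac32 \Big]. \tag{5.4} \]

Proof. Proceeding as in the case of the bivariate normal distribution, we obtain

\[ V^2(X,Y) = \frac{4}{\pi} \sum_{\ell=1}^\infty \Big( \frac{\rho}{2} \Big)^{2\ell} \big( (2\ell-3)!! \big)^2 \sum_{\substack{j, k \ge 2 \text{ even} \\ j + k = 2\ell}} \frac{1}{j! \, k!}
 = \frac{4}{\pi} \sum_{\ell=1}^\infty \Big( \frac{\rho}{2} \Big)^{2\ell} \big( (2\ell-3)!! \big)^2 \, \frac{2^{2\ell-1} - 2}{(2\ell)!}. \]

Using the standard notation, ${}_2F_1$, for Gauss' hypergeometric function we see that

\[ V^2(X,Y) = \frac{4}{\pi} \Big[ \tfrac12 \, {}_2F_1(-\tfrac12, -\tfrac12; \tfrac12; \rho^2) - 2 \, {}_2F_1(-\tfrac12, -\tfrac12; \tfrac12; \tfrac14 \rho^2) + \tfrac32 \Big]. \]

It is well-known (see Andrews, Askey, and Roy [1, pages 64 and 94]) that

\[ {}_2F_1(-\tfrac12, -\tfrac12; \tfrac12; \rho^2) = \rho \sin^{-1} \rho + (1 - \rho^2)^{1/2}. \]

On applying this formula to the above expression, we obtain (5.4). □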
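The mixture example illustrates a distribution with zero Pearson correlation but positive distance covariance. The following sketch compares a truncated series with a closed form; both are written under the assumption that the Lancaster coefficients are $\rho_n = \rho^n$ for even $n$ and zero for odd $n$, so that only even $j, k \ge 2$ contribute and the inner sum over $j + k = 2\ell$ equals $(2^{2\ell-1} - 2)/(2\ell)!$. The function names are illustrative.

```python
import math


def mixture_dcov_sq_series(rho, terms=400):
    """Truncated series for the squared distance covariance of the +/- rho normal mixture."""
    c = 0.5                      # c_l = ((2l-3)!!)^2 / (2l)!, starting at l = 1
    total = 0.0
    for l in range(1, terms + 1):
        # (2^(2l-1) - 2) / 2^(2l) = 1/2 - 2^(1-2l); this vanishes at l = 1
        total += c * (0.5 - 2.0 ** (1 - 2 * l)) * rho ** (2 * l)
        c *= (2 * l - 1) ** 2 / ((2 * l + 1) * (2 * l + 2))
    return 4.0 / math.pi * total


def mixture_dcov_sq_closed(rho):
    """Closed form obtained from the hypergeometric identity for arcsine."""
    return 4.0 / math.pi * (0.5 * rho * math.asin(rho)
                            + 0.5 * math.sqrt(1.0 - rho * rho)
                            - rho * math.asin(rho / 2.0)
                            - math.sqrt(4.0 - rho * rho) + 1.5)
```

Both functions return zero at `rho = 0` and a strictly positive value for `0 < abs(rho) < 1`, even though the Pearson correlation of the mixture is zero for every `rho`.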


5.2 The multivariate normal distribution 

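In this subsection, we will make extensive use of the notation $N_{r\cdot}$, $N_{\cdot c}$, $N_{*\cdot}$, $N_{\cdot *}$,
and $N_{\cdot\cdot}$ from Subsection 3.2 for the multi-index matrix $N \in \mathbb{N}_0^{p \times q}$. We now establish
the following result.

Proposition 5.4. Suppose that $(X, Y) \sim N_{p+q}(0, \Sigma)$, where $\Sigma$ is given in (3.4). Then
the affinely invariant distance covariance, $\widetilde{V}^2(X, Y)$, is given by

\[ \widetilde{V}^2(X,Y) = \frac{1}{\gamma_p \gamma_q} \sum_{J} \sum_{K} \frac{\Lambda_{XY}^{J+K}}{J! \, K!} \, A_{J,K} \, B_{J,K}, \tag{5.5} \]

where the sums are taken over all non-zero $J, K \in \mathbb{N}_0^{p \times q}$ such that all components of
$J_{*\cdot} + K_{*\cdot}$ and $J_{\cdot *} + K_{\cdot *}$ are even,

\[ A_{J,K} = \frac{\Gamma\big( \tfrac12 (J_{\cdot\cdot} + K_{\cdot\cdot} - 1) \big) \prod_{r=1}^p \Gamma\big( \tfrac12 (J_{r\cdot} + K_{r\cdot} + 1) \big)}{\Gamma\big( \tfrac12 (J_{\cdot\cdot} + K_{\cdot\cdot}) + \tfrac12 p \big)}, \tag{5.6} \]

and

\[ B_{J,K} = \frac{\Gamma\big( \tfrac12 (J_{\cdot\cdot} + K_{\cdot\cdot} - 1) \big) \prod_{c=1}^q \Gamma\big( \tfrac12 (J_{\cdot c} + K_{\cdot c} + 1) \big)}{\Gamma\big( \tfrac12 (J_{\cdot\cdot} + K_{\cdot\cdot}) + \tfrac12 q \big)}. \tag{5.7} \]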


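Proof. In this case, the index set $C$ is $\mathbb{N}_0^{p \times q}$, and we write the Lancaster expansion
(3.8) of $(\widetilde{X}, \widetilde{Y})$ in the form

\[ \phi_{\widetilde{X},\widetilde{Y}}(x,y) - \phi_{\widetilde{X}}(x) \, \phi_{\widetilde{Y}}(y) = \phi_{\widetilde{X}}(x) \, \phi_{\widetilde{Y}}(y) \sum_{N \ne 0} \frac{\Lambda_{XY}^N}{N!} \, H_{N_{*\cdot}}(x; I_p) \, H_{N_{\cdot *}}(y; I_q). \]

To calculate the Fourier transform $\mathcal{P}_N$ corresponding to $\widetilde{X}$, we apply the definition
(3.5) of the multivariate Hermite polynomials and integration-by-parts to deduce that
for $s \in \mathbb{R}^p$,

\[ \begin{aligned}
\mathcal{P}_N(s) &= \int_{\mathbb{R}^p} \exp(i \langle s, x \rangle) \, \phi_{\widetilde{X}}(x) \, H_{N_{*\cdot}}(x; I_p) \, dx \\
 &= (-1)^{N_{\cdot\cdot}} \int_{\mathbb{R}^p} \exp(i \langle s, x \rangle) \Big( \frac{\partial}{\partial x} \Big)^{N_{*\cdot}} \phi_{\widetilde{X}}(x) \, dx \\
 &= \int_{\mathbb{R}^p} \phi_{\widetilde{X}}(x) \Big( \frac{\partial}{\partial x} \Big)^{N_{*\cdot}} \exp(i \langle s, x \rangle) \, dx \\
 &= i^{N_{\cdot\cdot}} s^{N_{*\cdot}} \int_{\mathbb{R}^p} \phi_{\widetilde{X}}(x) \exp(i \langle s, x \rangle) \, dx \\
 &= i^{N_{\cdot\cdot}} s^{N_{*\cdot}} \exp(-\tfrac12 \langle s, s \rangle).
\end{aligned} \]

Similarly,

\[ \mathcal{Q}_N(t) = i^{N_{\cdot\cdot}} t^{N_{\cdot *}} \exp(-\tfrac12 \langle t, t \rangle), \]

$t \in \mathbb{R}^q$. Therefore,

\[ \int_{\mathbb{R}^p} \mathcal{P}_J(s) \, \mathcal{P}_K(-s) \, \frac{ds}{\|s\|^{p+1}} = (-1)^{K_{\cdot\cdot}} \, i^{J_{\cdot\cdot} + K_{\cdot\cdot}} \int_{\mathbb{R}^p} s^{J_{*\cdot} + K_{*\cdot}} \exp(-\langle s, s \rangle) \, \frac{ds}{\|s\|^{p+1}}. \]

We now change variables to hyperspherical coordinates: $s = r\omega$, where $r > 0$ and
$\omega = (\omega_1, \ldots, \omega_p) \in S^{p-1}$, the unit sphere in $\mathbb{R}^p$. Then the latter integral reduces to

\[ \int_0^\infty r^{J_{\cdot\cdot} + K_{\cdot\cdot} - 2} \exp(-r^2) \, dr \int_{S^{p-1}} \omega^{J_{*\cdot} + K_{*\cdot}} \, d\omega. \]

The integral over $\mathbb{R}_+$ is evaluated by replacing $r$ by $r^{1/2}$, and we obtain its value as
$\tfrac12 \Gamma\big( \tfrac12 (J_{\cdot\cdot} + K_{\cdot\cdot} - 1) \big)$.

It is easy to see that the integral over $S^{p-1}$ equals zero if any component of $J_{*\cdot} + K_{*\cdot}$
is odd. For the case in which each component of $J_{*\cdot} + K_{*\cdot}$ is even, we obtain

\[ \int_{S^{p-1}} \omega^{J_{*\cdot} + K_{*\cdot}} \, d\omega = A(S^{p-1}) \, E(\omega^{J_{*\cdot} + K_{*\cdot}}), \]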
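where $A(S^{p-1}) = 2\pi^{p/2} / \Gamma(\tfrac12 p)$ is the surface area of $S^{p-1}$ and $\omega$ now is a uniformly
distributed random vector on $S^{p-1}$. It is well-known that the random vector $(\omega_1^2, \ldots, \omega_p^2)$
has a Dirichlet distribution with parameters $(\tfrac12, \ldots, \tfrac12)$ so, by a classical
formula for the moments of the Dirichlet distribution [11, p. 488],

\[ E(\omega^{J_{*\cdot} + K_{*\cdot}}) = \frac{\Gamma(\tfrac12 p)}{[\Gamma(\tfrac12)]^p} \, \frac{\prod_{r=1}^p \Gamma\big( \tfrac12 (J_{r\cdot} + K_{r\cdot} + 1) \big)}{\Gamma\big( \tfrac12 (J_{\cdot\cdot} + K_{\cdot\cdot}) + \tfrac12 p \big)}. \]

Collecting together these results, we obtain

\[ \int_{\mathbb{R}^p} \mathcal{P}_J(s) \, \mathcal{P}_K(-s) \, \frac{ds}{\|s\|^{p+1}} = (-1)^{K_{\cdot\cdot}} \, i^{J_{\cdot\cdot} + K_{\cdot\cdot}} \, A_{J,K}, \]

where $A_{J,K}$ is given in (5.6). A similar expression, with $B_{J,K}$ from (5.7) in place of $A_{J,K}$, can be obtained for

\[ \int_{\mathbb{R}^q} \mathcal{Q}_J(t) \, \mathcal{Q}_K(-t) \, \frac{dt}{\|t\|^{q+1}}. \]

Since $J_{\cdot\cdot} + K_{\cdot\cdot}$ is even, the product of the two sign factors equals $(-1)^{J_{\cdot\cdot} + K_{\cdot\cdot}} = 1$,
from which the final result (5.5) follows. □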

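As a consequence of Proposition 5.4, we now derive the value of the affinely
invariant distance variance, $\widetilde{V}^2(X, X)$, when $X$ has a multivariate normal distribution.
We remark that the derivation of this result given by Dueck, et al. [5, Corollary 3.3]
utilized the theory of zonal polynomials, whereas the proof which we now give is by a
simpler method.

Corollary 5.5. (Dueck, et al. [5, Corollary 3.3]) Suppose that $(X, Y) \sim N_{2p}(0, \Sigma)$,
where $\Sigma$ in (3.4) satisfies $\Lambda_{XY} = \rho I_p$ in (3.7). Then

\[ \widetilde{V}^2(X,Y) = 4\pi \, \frac{\gamma_{p-1}^2}{\gamma_p^2} \Big[ {}_2F_1(-\tfrac12, -\tfrac12; \tfrac12 p; \rho^2) - 2 \, {}_2F_1(-\tfrac12, -\tfrac12; \tfrac12 p; \tfrac14 \rho^2) + 1 \Big] \tag{5.8} \]

and

\[ \widetilde{V}^2(X,X) = 4\pi \, \frac{\gamma_{p-1}^2}{\gamma_p^2} \Big[ \frac{\Gamma(\tfrac12 p) \, \Gamma(\tfrac12 p + 1)}{[\Gamma(\tfrac12 (p+1))]^2} - 2 \, {}_2F_1(-\tfrac12, -\tfrac12; \tfrac12 p; \tfrac14) + 1 \Big]. \tag{5.9} \]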

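Proof. Setting $p = q$ and $\Lambda_{XY} = \rho I_p$ in (5.5), we obtain

\[ \gamma_p^2 \, \widetilde{V}^2(X,Y) = \sum_{\substack{J \ne 0, \, K \ne 0 \\ J_{*\cdot} + K_{*\cdot}, \, J_{\cdot *} + K_{\cdot *} \text{ even}}} \frac{A_{J,K} \, B_{J,K}}{J! \, K!} \, \Lambda_{XY}^{J+K}. \]

By decomposing the set of all $(J, K)$ into a disjoint union,

\[ \{(J,K) : J + K \ne 0\} = \{(J,K) : J \ne 0, K \ne 0\} \cup \{(J,K) : J = 0, K \ne 0\} \cup \{(J,K) : J \ne 0, K = 0\}, \tag{5.10} \]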
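and noting that the summand is symmetric in $(J, K)$, we obtain

\[ \gamma_p^2 \, \widetilde{V}^2(X,Y) = \sum_{\substack{J + K \ne 0 \\ J_{*\cdot} + K_{*\cdot}, \, J_{\cdot *} + K_{\cdot *} \text{ even}}} \frac{A_{J,K} \, B_{J,K}}{J! \, K!} \, \Lambda_{XY}^{J+K}
 \; - \; 2 \sum_{\substack{J = 0, \, K \ne 0 \\ J_{*\cdot} + K_{*\cdot}, \, J_{\cdot *} + K_{\cdot *} \text{ even}}} \frac{A_{J,K} \, B_{J,K}}{J! \, K!} \, \Lambda_{XY}^{J+K}. \tag{5.11} \]

Note that

\[ \Lambda_{XY}^{J+K} = \prod_{r=1}^p \prod_{c=1}^p \big[ (\Lambda_{XY})_{rc} \big]^{J_{rc} + K_{rc}} = \prod_{r=1}^p \rho^{J_{rr} + K_{rr}} \cdot \prod_{r \ne c} 0^{J_{rc} + K_{rc}}. \]

This is non-zero iff $J_{rc} = K_{rc} = 0$ for all $r \ne c$, in which case $J$ and $K$ are diagonal
matrices, and then we have

\[ J_{*\cdot} + K_{*\cdot} = J_{\cdot *} + K_{\cdot *} = (J_{11} + K_{11}, \ldots, J_{pp} + K_{pp}) \]

and

\[ J! = \prod_{r=1}^p J_{rr}!, \qquad K! = \prod_{r=1}^p K_{rr}!. \]

Therefore,

\[ A_{J,K} = B_{J,K} = \frac{\Gamma\big( \tfrac12 (J_{11} + \cdots + J_{pp} + K_{11} + \cdots + K_{pp} - 1) \big) \prod_{r=1}^p \Gamma\big( \tfrac12 (J_{rr} + K_{rr} + 1) \big)}{\Gamma\big( \tfrac12 (J_{11} + \cdots + J_{pp} + K_{11} + \cdots + K_{pp}) + \tfrac12 p \big)}, \]

and, writing $J_{rr} + K_{rr} = 2 n_r$, this yields for the first term in (5.11)

\[ \sum_{\substack{J + K \ne 0 \\ J_{*\cdot} + K_{*\cdot}, \, J_{\cdot *} + K_{\cdot *} \text{ even}}} \frac{A_{J,K} \, B_{J,K}}{J! \, K!} \, \Lambda_{XY}^{J+K}
 = \sum_{\substack{n_1, \ldots, n_p \ge 0 \\ n_1 + \cdots + n_p \ne 0}} \rho^{2(n_1 + \cdots + n_p)} \Big[ \frac{\Gamma(n_1 + \cdots + n_p - \tfrac12) \prod_{r=1}^p \Gamma(n_r + \tfrac12)}{\Gamma(n_1 + \cdots + n_p + \tfrac12 p)} \Big]^2
 \sum_{J_{11} + K_{11} = 2n_1, \ldots, J_{pp} + K_{pp} = 2n_p} \frac{1}{J! \, K!}. \tag{5.12} \]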
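Since

\[ \sum_{J_{11} + K_{11} = 2n_1, \ldots, J_{pp} + K_{pp} = 2n_p} \frac{1}{J! \, K!} = \prod_{r=1}^p \sum_{J_{rr} + K_{rr} = 2n_r} \frac{1}{J_{rr}! \, K_{rr}!} = \prod_{r=1}^p \frac{2^{2n_r}}{(2n_r)!}, \]

we obtain that (5.12) equals

\[ \sum_{\substack{n_1, \ldots, n_p \ge 0 \\ n_1 + \cdots + n_p \ne 0}} (2\rho)^{2(n_1 + \cdots + n_p)} \Big[ \frac{\Gamma(n_1 + \cdots + n_p - \tfrac12)}{\Gamma(n_1 + \cdots + n_p + \tfrac12 p)} \Big]^2 \prod_{r=1}^p \frac{[\Gamma(n_r + \tfrac12)]^2}{(2n_r)!}
 = \sum_{n=1}^\infty (2\rho)^{2n} \Big[ \frac{\Gamma(n - \tfrac12)}{\Gamma(n + \tfrac12 p)} \Big]^2 \sum_{n_1 + \cdots + n_p = n} \; \prod_{r=1}^p \frac{[\Gamma(n_r + \tfrac12)]^2}{(2n_r)!}. \]

However,

\[ \sum_{n_1 + \cdots + n_p = n} \; \prod_{r=1}^p \frac{[\Gamma(n_r + \tfrac12)]^2}{(2n_r)!}
 = \sum_{n_1 + \cdots + n_p = n} \; \prod_{r=1}^p \frac{\pi \, (\tfrac12)_{n_r}}{2^{2n_r} \, n_r!}
 = \frac{\pi^p}{2^{2n}} \sum_{n_1 + \cdots + n_p = n} \; \prod_{r=1}^p \frac{(\tfrac12)_{n_r}}{n_r!}
 = \frac{\pi^p \, (\tfrac12 p)_n}{2^{2n} \, n!}, \]

so we obtain that the first term in (5.11) is given by

\[ \pi^p \sum_{n=1}^\infty \Big[ \frac{\Gamma(n - \tfrac12)}{\Gamma(n + \tfrac12 p)} \Big]^2 \frac{(\tfrac12 p)_n}{n!} \, \rho^{2n}
 = \pi^p \Big[ \frac{\Gamma(-\tfrac12)}{\Gamma(\tfrac12 p)} \Big]^2 \sum_{n=1}^\infty \frac{(-\tfrac12)_n \, (-\tfrac12)_n}{(\tfrac12 p)_n \, n!} \, \rho^{2n}
 = 4\pi \gamma_{p-1}^2 \big[ {}_2F_1(-\tfrac12, -\tfrac12; \tfrac12 p; \rho^2) - 1 \big]. \]

By similar arguments, we obtain for the second term in (5.11):

\[ \sum_{\substack{J = 0, \, K \ne 0 \\ J_{*\cdot} + K_{*\cdot}, \, J_{\cdot *} + K_{\cdot *} \text{ even}}} \frac{A_{J,K} \, B_{J,K}}{J! \, K!} \, \Lambda_{XY}^{J+K}
 = 4\pi \gamma_{p-1}^2 \big[ {}_2F_1(-\tfrac12, -\tfrac12; \tfrac12 p; \tfrac14 \rho^2) - 1 \big]. \]

Collecting together these results yields (5.8).

Setting $\rho = 1$ and applying Gauss' theorem for the value of ${}_2F_1(a, b; c; 1)$, we deduce
that $\widetilde{V}^2(X, X)$ is given by (5.9). □

We remark that the absolute convergence of the series in Corollary 5.5 follows from
the absolute convergence of Gauss' hypergeometric series. As a consequence, the series
(5.5) converges absolutely because the matrix $\Lambda_{XY}$ has norm less than 1.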
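The hypergeometric expression in Corollary 5.5 is simple to evaluate numerically. The sketch below sums Gauss' series directly and assumes the prefactor $4\pi \gamma_{p-1}^2 / \gamma_p^2$, with $\gamma_p = \pi^{(p+1)/2} / \Gamma((p+1)/2)$ and the convention $\gamma_0 = 1$; the function names and truncation level are illustrative, and the direct series is intended only for arguments of modulus less than 1.

```python
import math


def hyp2f1(a, b, c, z, terms=500):
    """Truncated Gauss hypergeometric series; intended for |z| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total


def gamma_const(p):
    """gamma_p = pi^((p+1)/2) / Gamma((p+1)/2); gamma_0 evaluates to 1."""
    return math.pi ** ((p + 1) / 2.0) / math.gamma((p + 1) / 2.0)


def affine_dcov_sq(p, rho):
    """Affinely invariant squared distance covariance when Lambda_XY = rho * I_p."""
    pref = 4.0 * math.pi * gamma_const(p - 1) ** 2 / gamma_const(p) ** 2
    c = 0.5 * p
    return pref * (hyp2f1(-0.5, -0.5, c, rho * rho)
                   - 2.0 * hyp2f1(-0.5, -0.5, c, 0.25 * rho * rho) + 1.0)


def affine_dvar_sq(p):
    """Affinely invariant squared distance variance, using Gauss' theorem at argument 1."""
    pref = 4.0 * math.pi * gamma_const(p - 1) ** 2 / gamma_const(p) ** 2
    c = 0.5 * p
    gauss = math.gamma(c) * math.gamma(c + 1.0) / math.gamma(c + 0.5) ** 2
    return pref * (gauss - 2.0 * hyp2f1(-0.5, -0.5, c, 0.25) + 1.0)
```

For `p = 1` these functions reduce to the bivariate normal results: `affine_dcov_sq(1, rho)` matches the closed form in terms of inverse sines, and `affine_dvar_sq(1)` equals $4/3 - 4(\sqrt{3} - 1)/\pi$ up to rounding.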
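5.3 The bivariate gamma distribution

Proposition 5.6. Suppose that the random vector $(X, Y)$ is distributed according to a
Sarmanov bivariate gamma distribution, as given by (3.10). Then,

\[ V^2(X,Y) = 2^{2(2 - \alpha - \beta)} \sum_{j,k=1}^\infty a_j \, a_k \, A_{j,k}(\alpha) \, A_{j,k}(\beta), \tag{5.13} \]

where

\[ A_{j,k}(\alpha) = (-1)^{j-1} \, 2^{-(j+k)} \Big( \frac{(\alpha)_j \, (\alpha)_k}{j! \, k!} \Big)^{1/2} \frac{\Gamma(2\alpha + j + k - 1)}{\Gamma(\alpha + j) \, \Gamma(\alpha + k)} \; {}_2F_1(-j - k + 2, \, 1 - \alpha - j; \, 2 - 2\alpha - j - k; \, 2). \]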


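Proof. By (3.10), there holds the expansion,

\[ \phi_{X,Y}(x,y) - \phi_X(x) \, \phi_Y(y) = \phi_X(x) \, \phi_Y(y) \sum_{n=1}^\infty a_n \, \widetilde{L}_n^{(\alpha-1)}(x) \, \widetilde{L}_n^{(\beta-1)}(y), \]

$x, y > 0$. Then, it follows from (4.1) that for $s \in \mathbb{R}$,

\[ \mathcal{P}_n(s) = \int_0^\infty \exp(i s x) \, \widetilde{L}_n^{(\alpha-1)}(x) \, \phi_X(x) \, dx
 = \frac{1}{\Gamma(\alpha)} \int_0^\infty \exp\big( -(1 - i s) x \big) \, x^{\alpha-1} \, \widetilde{L}_n^{(\alpha-1)}(x) \, dx. \]

By a direct calculation using (3.9), we obtain

\[ \mathcal{P}_n(s) = \Big( \frac{(\alpha)_n}{n!} \Big)^{1/2} (1 - i s)^{-(\alpha+n)} \, (-i s)^n \]

and, analogously,

\[ \mathcal{Q}_n(t) = \Big( \frac{(\beta)_n}{n!} \Big)^{1/2} (1 - i t)^{-(\beta+n)} \, (-i t)^n, \]

$t \in \mathbb{R}$.

We now calculate the integral

\[ \int_{\mathbb{R}} \mathcal{P}_j(s) \, \mathcal{P}_k(-s) \, \frac{ds}{s^2} = \Big( \frac{(\alpha)_j \, (\alpha)_k}{j! \, k!} \Big)^{1/2} i^{-j+k} \int_{\mathbb{R}} g(s) \, ds, \tag{5.14} \]

where

\[ g(s) = s^{j+k-2} \, (1 - i s)^{-(\alpha+j)} \, (1 + i s)^{-(\alpha+k)}, \tag{5.15} \]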
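$s \in \mathbb{R}$. To calculate the integral on the right-hand side of (5.14), we utilize Cauchy's
beta integral [1, p. 48] which provides that, for $a, u, v \in \mathbb{C}$ such that $\mathrm{Re}(a) > 0$ and
$\mathrm{Re}(u + v) > 1$,

\[ \int_{\mathbb{R}} (1 - i s)^{-u} \, (1 + i a s)^{-v} \, ds = 2\pi \, \frac{\Gamma(u + v - 1)}{\Gamma(u) \, \Gamma(v)} \, a^{u-1} \, (a + 1)^{1-u-v}. \tag{5.16} \]

To differentiate the left-hand side of (5.16) $m$ times with respect to $a$, we apply the
formula,

\[ \Big( \frac{\partial}{\partial a} \Big)^m (1 + i a s)^{-v} = (-i s)^m \, (v)_m \, (1 + i a s)^{-v-m}; \]

by differentiating under the integral we obtain

\[ \Big( \frac{\partial}{\partial a} \Big)^m \int_{\mathbb{R}} (1 - i s)^{-u} \, (1 + i a s)^{-v} \, ds = (-i)^m \, (v)_m \int_{\mathbb{R}} s^m \, (1 - i s)^{-u} \, (1 + i a s)^{-v-m} \, ds. \]

To differentiate the right-hand side of (5.16) $m$ times with respect to $a$, we apply
Leibniz's formula,

\[ \Big( \frac{\partial}{\partial a} \Big)^m \big[ a^{u-1} (a + 1)^{1-u-v} \big] = \sum_{\ell=0}^m \binom{m}{\ell} \Big[ \Big( \frac{\partial}{\partial a} \Big)^\ell a^{u-1} \Big] \Big[ \Big( \frac{\partial}{\partial a} \Big)^{m-\ell} (a + 1)^{1-u-v} \Big]. \]

Noting that

\[ \Big( \frac{\partial}{\partial a} \Big)^\ell a^{u-1} = (-1)^\ell \, (1 - u)_\ell \, a^{u-1-\ell} \]

and

\[ \Big( \frac{\partial}{\partial a} \Big)^{m-\ell} (a + 1)^{1-u-v} = (-1)^m \, (a + 1)^{1-u-v-m+\ell} \, \frac{(u + v - 1)_m}{(2 - u - v - m)_\ell}, \]

we obtain

\[ \begin{aligned}
\Big( \frac{\partial}{\partial a} \Big)^m \big[ a^{u-1} (a + 1)^{1-u-v} \big]
 &= (-1)^m \, a^{u-1} \, (a + 1)^{1-u-v-m} \, (u + v - 1)_m \sum_{\ell=0}^m \frac{(-m)_\ell \, (1 - u)_\ell}{\ell! \, (2 - u - v - m)_\ell} \Big( \frac{a + 1}{a} \Big)^\ell \\
 &= (-1)^m \, a^{u-1} \, (a + 1)^{1-u-v-m} \, (u + v - 1)_m \; {}_2F_1\Big( -m, \, 1 - u; \, 2 - u - v - m; \, \frac{a + 1}{a} \Big).
\end{aligned} \]

Comparing the derivatives of the left- and right-hand sides of (5.16), we obtain

\[ \int_{\mathbb{R}} s^m \, (1 - i s)^{-u} \, (1 + i a s)^{-v-m} \, ds
 = 2\pi \, (-i)^m \, a^{u-1} \, (a + 1)^{1-u-v-m} \, \frac{\Gamma(u + v - 1)}{\Gamma(u) \, \Gamma(v)} \, \frac{(u + v - 1)_m}{(v)_m} \; {}_2F_1\Big( -m, \, 1 - u; \, 2 - u - v - m; \, \frac{a + 1}{a} \Big). \]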
Substituting $a = 1$, $m = j+k-2$, $u = \alpha+j$, and $v = \alpha+k-m = \alpha-j+2$, the latter equation reduces to

\[
\int_{\mathbb{R}} g(s)\, ds = 2^{-2\alpha-j-k+2}\, \pi\, (-i)^{j+k-2}\, \frac{\Gamma(2\alpha+1)}{\Gamma(\alpha+j)\, \Gamma(\alpha-j+2)}
\times \frac{(2\alpha+1)_{j+k-2}}{(\alpha-j+2)_{j+k-2}}\; {}_2F_1(-j-k+2,\, 1-\alpha-j;\, 2-2\alpha-j-k;\, 2).
\]
Therefore,

\[
\int_{\mathbb{R}} \mathcal{P}_j(s)\, \mathcal{P}_k(-s)\, \frac{ds}{s^2}
= 2^{-2\alpha-j-k+2}\, \pi\, (-1)^{j-1} \Big(\frac{(\alpha)_j\, (\alpha)_k}{j!\, k!}\Big)^{1/2} \frac{\Gamma(2\alpha+j+k-1)}{\Gamma(\alpha+j)\, \Gamma(\alpha+k)}
\times {}_2F_1(-j-k+2,\, 1-\alpha-j;\, 2-2\alpha-j-k;\, 2),
\]

and similarly for $Y$. Substituting these expressions into Theorem 4.1 and simplifying the outcome, we obtain the series (5.13) as a formal expression for $\mathcal{V}^2(X, Y)$.
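As a sanity check on the closed form for $\int_{\mathbb{R}} \mathcal{P}_j(s)\mathcal{P}_k(-s)\, ds/s^2$, the two lowest-order cases can be compared with direct evaluations: for $j = k = 1$ the integral is $\alpha\, B(\tfrac12, \alpha+\tfrac12)$, and for $(j,k) = (2,1)$ it is $[(\alpha)_2\, \alpha/2]^{1/2} B(\tfrac32, \alpha+\tfrac12)$. The sketch below evaluates the terminating ${}_2F_1$ as a finite sum; the helper names are illustrative.

```python
import math


def poch(a, n):
    """Rising factorial (a)_n."""
    out = 1.0
    for r in range(n):
        out *= a + r
    return out


def hyp2f1_terminating(m, b, c, z):
    """Finite sum for 2F1(-m, b; c; z), with m a non-negative integer."""
    return sum(poch(-m, l) * poch(b, l) / (poch(c, l) * math.factorial(l)) * z ** l
               for l in range(m + 1))


def pjk_integral(j, k, alpha):
    """Closed form for int_R P_j(s) P_k(-s) ds / s^2 in the gamma case."""
    pref = 2.0 ** (-2 * alpha - j - k + 2) * math.pi * (-1) ** (j - 1)
    norm = math.sqrt(poch(alpha, j) * poch(alpha, k)
                     / (math.factorial(j) * math.factorial(k)))
    gam = math.gamma(2 * alpha + j + k - 1) / (
        math.gamma(alpha + j) * math.gamma(alpha + k))
    return pref * norm * gam * hyp2f1_terminating(
        j + k - 2, 1 - alpha - j, 2 - 2 * alpha - j - k, 2.0)


def beta_fn(x, y):
    """Classical beta function B(x, y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)
```

The closed form should also be symmetric in $j$ and $k$, which is a useful check on the sign factor $(-1)^{j-1}$.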

Finally, we verify that (5.13) converges absolutely. By (5.15),

\[
\int_{\mathbb{R}} |g(s)|\, ds = \int_{\mathbb{R}} |s|^{j+k-2}\, (1+s^2)^{-(2\alpha+j+k)/2}\, ds.
\]

Making the change of variables $s^2 = t/(1-t)$, the latter integral is transformed to

\[
\int_0^1 t^{\frac12(j+k-3)}\, (1-t)^{\alpha-\frac12}\, dt = B\big(\tfrac12(j+k-1),\, \alpha+\tfrac12\big),
\tag{5.17}
\]

where $B(\cdot,\cdot)$ is the classical beta function, and this integral converges absolutely because $j+k-1 > 0$ and $\alpha + 1/2 > 0$ for all $j, k \in \mathbb{N}$ and $\alpha > 0$. Hence, to establish that (5.13) converges absolutely, we have only to show that the series


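\[
\sum_{j=1}^\infty \sum_{k=1}^\infty a_j\, a_k \Big[\frac{(\alpha)_j\, (\beta)_j}{(j!)^2}\Big]^{1/2} \Big[\frac{(\alpha)_k\, (\beta)_k}{(k!)^2}\Big]^{1/2}
\times B\big(\tfrac12(j+k-1),\, \alpha+\tfrac12\big)\, B\big(\tfrac12(j+k-1),\, \beta+\tfrac12\big)
\tag{5.18}
\]

converges absolutely.

For $j+k \ge 3$, it follows from (5.17) that

\[
B\big(\tfrac12(j+k-1),\, \alpha+\tfrac12\big) \le \int_0^1 (1-t)^{\alpha-\frac12}\, dt = \frac{1}{\alpha+\frac12}.
\]

Since $a_j = \lambda^j [(\beta)_j/(\alpha)_j]^{1/2}$, we have $a_j [(\alpha)_j (\beta)_j]^{1/2} = \lambda^j (\beta)_j$, and the term in (5.18) with $j = k = 1$ equals $\beta^2 \lambda^2\, B(\tfrac12, \alpha+\tfrac12)\, B(\tfrac12, \beta+\tfrac12)$. Therefore, (5.18) is bounded above by

\[
\beta^2 \lambda^2\, B(\tfrac12, \alpha+\tfrac12)\, B(\tfrac12, \beta+\tfrac12)
+ \frac{1}{(\alpha+\frac12)(\beta+\frac12)} \sum_{j+k \ge 3} \frac{(\beta)_j\, (\beta)_k}{j!\, k!}\, \lambda^{j+k}
\]
\[
\le \beta^2 \lambda^2\, B(\tfrac12, \alpha+\tfrac12)\, B(\tfrac12, \beta+\tfrac12)
+ \frac{1}{(\alpha+\frac12)(\beta+\frac12)} \Big(\sum_{j=0}^\infty \frac{(\beta)_j}{j!}\, \lambda^j\Big) \Big(\sum_{k=0}^\infty \frac{(\beta)_k}{k!}\, \lambda^k\Big)
\]
\[
= \beta^2 \lambda^2\, B(\tfrac12, \alpha+\tfrac12)\, B(\tfrac12, \beta+\tfrac12)
+ \frac{(1-\lambda)^{-2\beta}}{(\alpha+\frac12)(\beta+\frac12)} < \infty
\]

for $\lambda \in [0,1)$. Hence (5.18), and then also (5.13), converges absolutely for all $\alpha, \beta$ and all $\lambda \in [0,1)$. □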
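The two elementary facts used in the convergence argument, namely the bound $B(\tfrac12(j+k-1), \alpha+\tfrac12) \le 1/(\alpha+\tfrac12)$ for $j+k \ge 3$ and the binomial series $\sum_{j \ge 0} (\beta)_j \lambda^j / j! = (1-\lambda)^{-\beta}$, can be spot-checked numerically. This is a minimal sketch with arbitrary truncation levels; the function names are illustrative.

```python
import math


def beta_fn(x, y):
    """Classical beta function B(x, y)."""
    return math.gamma(x) * math.gamma(y) / math.gamma(x + y)


def beta_bound_holds(alpha, max_index=12):
    """Check B((j+k-1)/2, alpha+1/2) <= 1/(alpha+1/2) whenever j+k >= 3."""
    bound = 1.0 / (alpha + 0.5)
    for j in range(1, max_index + 1):
        for k in range(1, max_index + 1):
            if j + k >= 3 and beta_fn(0.5 * (j + k - 1), alpha + 0.5) > bound * (1 + 1e-12):
                return False
    return True


def binomial_series(beta, lam, terms=400):
    """Partial sum of sum_j (beta)_j lam^j / j!, which tends to (1-lam)^(-beta)."""
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= (beta + j) * lam / (j + 1)  # ratio of consecutive terms
    return total
```

Equality in the beta bound occurs exactly when $j+k = 3$, since then the factor $t^{\frac12(j+k-3)}$ in (5.17) is identically 1.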


In calculating the distance variances $\mathcal{V}(X,X)$ and $\mathcal{V}(Y,Y)$, it is only the marginal distributions of $X$ and $Y$ which are relevant. Therefore, we may assume that $X$ and $Y$ have any joint distribution for which the marginal distributions are gamma with parameters $\alpha$ and $\beta$, respectively.

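Holding $\alpha$ fixed and setting $\beta = \alpha$, the Sarmanov bivariate gamma distribution reduces to the Kibble-Moran distribution, and the characteristic function of $(X, Y)$ is

\[
\psi_{X,Y}(t_1, t_2) = \big[(1 - it_1)(1 - it_2) + \lambda t_1 t_2\big]^{-\alpha};
\tag{5.19}
\]

see [11, p. 436]. Next, we let $\lambda \to 1-$; then $\psi_{X,Y}(t_1, t_2)$ converges to

\[
\big[1 - i(t_1 + t_2)\big]^{-\alpha} = E \exp\big(i(t_1 + t_2) X\big),
\]

proving that if $\lambda = 1$ then $X = Y$, almost surely. Therefore, the distance variance $\mathcal{V}(X,X)$ is a limiting case of $\mathcal{V}(X,Y)$, viz.,

\[
\mathcal{V}^2(X,X) = \frac{1}{\pi^2} \int_{\mathbb{R}^2} \big|\psi_X(s+t) - \psi_X(s)\, \psi_X(t)\big|^2\, \frac{ds\, dt}{s^2 t^2}
= \lim_{\lambda \to 1-} \frac{1}{\pi^2} \int_{\mathbb{R}^2} \big|\psi_{X,Y}(s,t) - \psi_X(s)\, \psi_Y(t)\big|^2\, \frac{ds\, dt}{s^2 t^2}\, \bigg|_{\beta = \alpha}
= \lim_{\lambda \to 1-} \mathcal{V}^2(X,Y)\, \Big|_{\beta = \alpha}.
\]

Analogously, by holding $\beta$ fixed and then setting $\alpha = \beta$, we obtain

\[
\mathcal{V}^2(Y,Y) = \lim_{\lambda \to 1-} \mathcal{V}^2(X,Y)\, \Big|_{\alpha = \beta}.
\]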

As a remark on the gamma distributions, note that if we replace $a_j$ in (5.13) by $[(\beta)_j/(\alpha)_j]^{1/2} \lambda_j$ then the result above generalizes to the distribution functions introduced by Griffiths [8]; see also (3.12) and (3.13).

As noted in Subsection 5.3, if $\alpha \ne \beta$ then $\mathrm{Corr}(X, Y) \ne \lambda$, and then it is impractical to compare $\mathrm{Corr}(X, Y)$ with $\rho$, the correlation coefficient in the bivariate normal case. If $\alpha = \beta$ then $\mathrm{Corr}(X, Y) = \lambda$, so we will consider only the case in which $\alpha = \beta$.

In Figure 1, we graph the difference between the distance correlation coefficient of the Kibble-Moran bivariate gamma distribution, with shape parameter $\alpha$, and the distance correlation coefficient of the bivariate normal distribution. The graphs are given for the cases in which $\alpha = 0.1, 1, 10$. Figure 1 suggests that the distance correlation coefficient $\mathcal{R}(X, Y)$ converges, as $\alpha \to \infty$, to the distance correlation coefficient for the bivariate normal distribution.








Figure 1: Graphs of the difference between the distance correlation coefficient of the Kibble-Moran bivariate gamma distribution, with shape parameter $\alpha$, and the distance correlation coefficient of the bivariate normal distribution. The graphs are given for the cases in which $\alpha = 0.1$, $1$, and $10$.

This result can be proved as follows: Let $(X, Y)_\alpha$ denote a bivariate gamma random vector with probability density function (3.10) where $\alpha = \beta$. It follows from the characteristic function (5.19) that if $(X_1, Y_1)_\alpha, \ldots, (X_n, Y_n)_\alpha$ are independent, identically distributed random vectors with the same distribution as $(X, Y)_\alpha$ then $(X_1, Y_1)_\alpha + \cdots + (X_n, Y_n)_\alpha$ has the same distribution as $(X, Y)_{n\alpha}$. Since $(X, Y)_\alpha$ has finite mean and covariance matrix then, by the Central Limit Theorem, $n^{-1/2}\big[(X, Y)_{n\alpha} - E(X, Y)_{n\alpha}\big]$ converges, as $n \to \infty$, to a bivariate normal distribution. Because the distance correlation coefficient is unchanged by translations and by a common rescaling of the variables, $\mathcal{R}((X, Y)_{n\alpha})$ converges to the distance correlation coefficient of the bivariate normal distribution.

Equivalently, as $\alpha \to \infty$, $\mathcal{R}((X, Y)_\alpha)$ converges to the distance correlation coefficient of the bivariate normal distribution. Moreover, as the graphs indicate, the rate of convergence is rapid. Indeed, for $\alpha = 0.1$, we observe from Figure 1 that the maximum absolute difference between $\mathcal{R}((X, Y)_\alpha)$ and the distance correlation of the bivariate normal distribution is less than 0.02; and at $\alpha = 10$, the maximum absolute difference already is negligible.

5.4 The bivariate Poisson distribution 

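Proposition 5.7. Suppose that the random vector $(X, Y)$ is distributed according to a bivariate Poisson distribution, as given by (3.15). Then

\[
\mathcal{V}^2(X, Y) = \sum_{j,k=1}^\infty 4^{j+k-1}\, \frac{(\lambda a)^{j+k}}{j!\, k!}\, A_{jk}^2,
\tag{5.20}
\]

where

\[
A_{jk} = \frac{1}{(j-1)!} \sum_{\ell=0}^{\lfloor (j-k)/2 \rfloor} \binom{j-k}{2\ell} (-1)^\ell\, (\tfrac12)_\ell\, (\tfrac12)_{j-\ell-1}\; {}_1F_1\big(j-\ell-\tfrac12;\, j;\, -4a\big)
\]

for $j \ge k$, and $A_{jk} = A_{kj}$ for $j < k$.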

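Proof. By (3.15) and (4.1), we have

\[
\mathcal{P}_n(s) = \mathcal{Q}_n(s) = E \exp(isX)\, C_n(X; a),
\]

$s \in \mathbb{R}$. Substituting the definition (3.14) of the Poisson-Charlier polynomials into the expectation and reversing the order of summation, we obtain

\[
\mathcal{P}_n(s) = \mathcal{Q}_n(s) = \sum_{x=0}^\infty \exp(isx)\, C_n(x; a)\, \frac{e^{-a} a^x}{x!}
= \Big(\frac{a^n}{n!}\Big)^{1/2} (1 - e^{is})^n \exp\big(-a(1 - e^{is})\big).
\]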
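The closed form for $\mathcal{P}_n$ in the Poisson case can be compared with a truncated version of the defining sum. The sketch below assumes that the polynomials in (3.14) are the orthonormal Poisson-Charlier polynomials $C_n(x;a) = (a^n/n!)^{1/2} \sum_k \binom{n}{k} x(x-1)\cdots(x-k+1)\, (-1/a)^k$; the truncation point `x_max` and the function names are arbitrary illustrative choices.

```python
import cmath
import math


def charlier(n, x, a):
    """Orthonormal Poisson-Charlier polynomial under the assumed normalisation."""
    total = 0.0
    falling = 1.0  # x (x-1) ... (x-k+1), starting from the empty product
    for k in range(n + 1):
        total += math.comb(n, k) * falling * (-1.0 / a) ** k
        falling *= x - k
    return math.sqrt(a ** n / math.factorial(n)) * total


def p_n_sum(n, a, s, x_max=150):
    """Truncated sum over x of exp(isx) C_n(x;a) e^{-a} a^x / x!."""
    total = 0j
    pmf = math.exp(-a)  # Poisson probability of x = 0
    for x in range(x_max + 1):
        total += cmath.exp(1j * s * x) * charlier(n, x, a) * pmf
        pmf *= a / (x + 1)  # step to the probability of x + 1
    return total


def p_n_closed(n, a, s):
    """Closed form (a^n/n!)^(1/2) (1-e^{is})^n exp(-a(1-e^{is}))."""
    z = 1 - cmath.exp(1j * s)
    return math.sqrt(a ** n / math.factorial(n)) * z ** n * cmath.exp(-a * z)
```

With a different sign convention for $C_n$ the two sides would differ by $(-1)^n$, which does not affect the products $\mathcal{P}_j \mathcal{P}_k \mathcal{Q}_j \mathcal{Q}_k$ entering the distance covariance.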

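Therefore, for $j, k \ge 1$,

\[
\int_{\mathbb{R}} \mathcal{P}_j(s)\, \mathcal{P}_k(-s)\, \frac{ds}{s^2}
= \Big(\frac{a^{j+k}}{j!\, k!}\Big)^{1/2} \int_{\mathbb{R}} (1 - e^{is})^j (1 - e^{-is})^k \exp\big(-a(1 - e^{is} + 1 - e^{-is})\big)\, \frac{ds}{s^2}
\]
\[
= \Big(\frac{a^{j+k}}{j!\, k!}\Big)^{1/2} \int_{\mathbb{R}} (1 - e^{is})^j (1 - e^{-is})^k \exp\big(-2a(1 - \cos s)\big)\, \frac{ds}{s^2}.
\tag{5.21}
\]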

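Changing variables in this integral from $s$ to $-s$ shows that the integral is symmetric in $j$ and $k$; therefore we assume, without loss of generality, that $j \ge k$. We now write

\[
(1 - e^{is})^j (1 - e^{-is})^k = (1 - e^{is})^{j-k} (1 - e^{is})^k (1 - e^{-is})^k
= (1 - e^{is})^{j-k} \big(2(1 - \cos s)\big)^k
\]

and apply the binomial theorem in the form,

\[
(1 - e^{is})^{j-k} = (1 - \cos s - i \sin s)^{j-k}
= \sum_{\ell=0}^{j-k} \binom{j-k}{\ell} (-i)^\ell (\sin s)^\ell (1 - \cos s)^{j-k-\ell}.
\]

Then, it follows that the integral in (5.21) equals

\[
2^k \sum_{\ell=0}^{j-k} \binom{j-k}{\ell} (-i)^\ell \int_{\mathbb{R}} (\sin s)^\ell (1 - \cos s)^{j-\ell} \exp\big(-2a(1 - \cos s)\big)\, \frac{ds}{s^2}.
\tag{5.22}
\]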
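Expanding the exponential term,

\[
\exp\big(-2a(1 - \cos s)\big) = \sum_{m=0}^\infty \frac{(-2a)^m}{m!} (1 - \cos s)^m,
\]

applying the half-angle identities, $\sin s = 2 \sin \tfrac12 s \cos \tfrac12 s$ and $1 - \cos s = 2 (\sin \tfrac12 s)^2$, and integrating term-by-term, we deduce that (5.22) equals

\[
2^k \sum_{\ell=0}^{j-k} \binom{j-k}{\ell} (-i)^\ell \sum_{m=0}^\infty \frac{(-2a)^m}{m!} \int_{\mathbb{R}} \big(2 \sin \tfrac12 s \cos \tfrac12 s\big)^\ell \big(2 (\sin \tfrac12 s)^2\big)^{j-\ell+m}\, \frac{ds}{s^2}
\]
\[
= \sum_{\ell=0}^{j-k} \binom{j-k}{\ell} (-i)^\ell \sum_{m=0}^\infty \frac{(-a)^m}{m!}\, 2^{j+k+2m} \int_{\mathbb{R}} (\cos \tfrac12 s)^\ell (\sin \tfrac12 s)^{2(j+m)-\ell}\, \frac{ds}{s^2}.
\tag{5.23}
\]

If $\ell$ is odd then the integrand in the latter integral is an odd function of $s$, so the integral equals 0. Hence, (5.23) equals

\[
\sum_{\ell=0}^{\lfloor (j-k)/2 \rfloor} \binom{j-k}{2\ell} (-i)^{2\ell} \sum_{m=0}^\infty \frac{(-a)^m}{m!}\, 2^{j+k+2m} \int_{\mathbb{R}} (\cos \tfrac12 s)^{2\ell} (\sin \tfrac12 s)^{2(j+m-\ell)}\, \frac{ds}{s^2},
\]

where $\lfloor (j-k)/2 \rfloor$ denotes the greatest integer less than or equal to $(j-k)/2$.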
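Next, we introduce the formula

\[
\int_{\mathbb{R}} (\cos \tfrac12 s)^{2\ell} (\sin \tfrac12 s)^{2k}\, \frac{ds}{s^2}
= \frac{(2\ell-1)!!\, (2k-3)!!}{(2\ell+2k-2)!!}\, \frac{\pi}{2},
\tag{5.24}
\]

$\ell = 0, 1, 2, \ldots$, $k = 1, 2, 3, \ldots$. This result is well-known for the case in which $\ell = 0$ (see [7, p. 483, 3.821(10)]), and the general case can be established by induction on $\ell$ with the inductive step being obtained by writing

\[
(\cos \tfrac12 s)^{2\ell+2} = (\cos \tfrac12 s)^{2\ell} (\cos \tfrac12 s)^2 = (\cos \tfrac12 s)^{2\ell} \big(1 - \sin^2 \tfrac12 s\big).
\]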
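The inductive step can be checked in exact arithmetic: writing $R(\ell, k)$ for the rational factor multiplying $\pi/2$ in (5.24), the identity $\cos^{2\ell+2} = \cos^{2\ell}(1 - \sin^2)$ requires $R(\ell+1, k) = R(\ell, k) - R(\ell, k+1)$. The following sketch verifies this recurrence with the conventions $(-1)!! = 0!! = 1$; the function names are illustrative.

```python
from fractions import Fraction


def double_factorial(n):
    """n!! with the conventions (-1)!! = 0!! = 1."""
    out = 1
    while n > 1:
        out *= n
        n -= 2
    return out


def ratio_524(l, k):
    """Rational factor (2l-1)!! (2k-3)!! / (2l+2k-2)!! from (5.24)."""
    return Fraction(double_factorial(2 * l - 1) * double_factorial(2 * k - 3),
                    double_factorial(2 * l + 2 * k - 2))


def recurrence_holds(max_l=6, max_k=6):
    """Check R(l+1, k) == R(l, k) - R(l, k+1) on a grid of indices."""
    return all(ratio_524(l + 1, k) == ratio_524(l, k) - ratio_524(l, k + 1)
               for l in range(max_l + 1) for k in range(1, max_k + 1))
```

Together with the base case $\ell = 0$, this recurrence determines the right-hand side of (5.24) for all $\ell$.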

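Hence, we find that (5.23) equals

\[
\frac{\pi}{2} \sum_{\ell=0}^{\lfloor (j-k)/2 \rfloor} \binom{j-k}{2\ell} (-1)^\ell \sum_{m=0}^\infty \frac{(-a)^m}{m!}\, 2^{j+k+2m}\, \frac{(2\ell-1)!!\, \big(2(j+m-\ell)-3\big)!!}{\big(2(j+m)-2\big)!!}.
\]

Writing each double factorial in terms of rising factorials, and simplifying the resulting expressions, we find that this sum equals

\[
\pi\, \frac{2^{j+k-1}}{(j-1)!} \sum_{\ell=0}^{\lfloor (j-k)/2 \rfloor} \binom{j-k}{2\ell} (-1)^\ell\, (\tfrac12)_\ell\, (\tfrac12)_{j-\ell-1}\; {}_1F_1\big(j-\ell-\tfrac12;\, j;\, -4a\big).
\]

Substituting this result into (5.21), we obtain

\[
\int_{\mathbb{R}} \mathcal{P}_j(s)\, \mathcal{P}_k(-s)\, \frac{ds}{s^2}
= \pi \Big(\frac{a^{j+k}}{j!\, k!}\Big)^{1/2} \frac{2^{j+k-1}}{(j-1)!} \sum_{\ell=0}^{\lfloor (j-k)/2 \rfloor} \binom{j-k}{2\ell} (-1)^\ell\, (\tfrac12)_\ell\, (\tfrac12)_{j-\ell-1}\; {}_1F_1\big(j-\ell-\tfrac12;\, j;\, -4a\big).
\]

Substituting this result into Theorem 4.1 and simplifying the outcome, we obtain the series (5.20) as a formal expression for $\mathcal{V}^2(X, Y)$.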

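Finally, we establish the absolute convergence of the series (5.20). On applying to (5.21) the identity

\[
|1 - e^{is}| = |1 - e^{-is}| = \big(2(1 - \cos s)\big)^{1/2} = 2 \big(\sin^2 \tfrac12 s\big)^{1/2}
\]

and the inequality

\[
\exp\big(-2a(1 - \cos s)\big) \le 1,
\]

$s \in \mathbb{R}$, we obtain

\[
\Big| \int_{\mathbb{R}} \mathcal{P}_j(s)\, \mathcal{P}_k(-s)\, \frac{ds}{s^2} \Big|
\le \Big(\frac{a^{j+k}}{j!\, k!}\Big)^{1/2} \int_{\mathbb{R}} |1 - e^{is}|^j\, |1 - e^{-is}|^k \exp\big(-2a(1 - \cos s)\big)\, \frac{ds}{s^2}
\le \Big(\frac{(4a)^{j+k}}{j!\, k!}\Big)^{1/2} \int_{\mathbb{R}} \big(\sin^2 \tfrac12 s\big)^{(j+k)/2}\, \frac{ds}{s^2}.
\]

By the Cauchy-Schwarz inequality,

\[
\int_{\mathbb{R}} \big(\sin^2 \tfrac12 s\big)^{(j+k)/2}\, \frac{ds}{s^2}
= \int_{\mathbb{R}} \big(\sin^2 \tfrac12 s\big)^{j/2} \big(\sin^2 \tfrac12 s\big)^{k/2}\, \frac{ds}{s^2}
\le \Big( \int_{\mathbb{R}} \big(\sin^2 \tfrac12 s\big)^{j}\, \frac{ds}{s^2} \Big)^{1/2} \Big( \int_{\mathbb{R}} \big(\sin^2 \tfrac12 s\big)^{k}\, \frac{ds}{s^2} \Big)^{1/2}.
\]

Since $(2k-3)!!/(2k-2)!! \le 1$ for all $k \in \mathbb{N}$ then it follows from (5.24) with $\ell = 0$ that

\[
\int_{\mathbb{R}} \big(\sin^2 \tfrac12 s\big)^{k}\, \frac{ds}{s^2} \le \pi;
\]

therefore,

\[
\Big| \int_{\mathbb{R}} \mathcal{P}_j(s)\, \mathcal{P}_k(-s)\, \frac{ds}{s^2} \Big| \le \pi \Big(\frac{(4a)^{j+k}}{j!\, k!}\Big)^{1/2},
\]

and the same holds for the functions $\mathcal{Q}_j$. Substituting these bounds into the general series expansion (4.3), we obtain the upper bound

\[
\mathcal{V}^2(X, Y) \le \sum_{j=1}^\infty \sum_{k=1}^\infty \frac{(4 \lambda a)^{j+k}}{j!\, k!} = \big(\exp(4 \lambda a) - 1\big)^2 < \infty,
\]

for all $\lambda \in [0,1]$ and $a > 0$. Therefore, the series (5.20) converges absolutely. □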

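To calculate the distance variance, the argument given in the bivariate gamma case remains valid here. By Koudou [14, p. 103], the characteristic function of $(X, Y)$ is

\[
\psi_{X,Y}(s, t) = \exp\big[a(1 - \lambda)(e^{is} - 1) + a(1 - \lambda)(e^{it} - 1) + \lambda a (e^{i(s+t)} - 1)\big].
\]

Therefore,

\[
\lim_{\lambda \to 1-} \psi_{X,Y}(s, t) = \exp\big[a(e^{i(s+t)} - 1)\big] = \psi_X(s + t),
\]

so we obtain

\[
\mathcal{V}^2(X, X) = \mathcal{V}^2(Y, Y) = \lim_{\lambda \to 1-} \mathcal{V}^2(X, Y).
\]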


5.5 The bivariate negative binomial distribution 


Proposition 5.8. Suppose that the random vector (X, Y) is distributed according to a 
bivariate negative binomial distribution, as given by (3.17). Then, 

V2(X, r) = (1 - ^ (5.25) 

j,k=i 

where 


= X] 

U,^2=0 

|U-^2| 


^ ^ 1=0 


l\ 


1 + Y 


X 


E 




Ml — (!. 2 \ — tA)\ i2m)\ ik + m — l)\ 

m=0 '' ' / \ \ / 


2 Fi{-i, k + m— B fc + m; 2), 


for $j \ge k$, and $A_{jk} = A_{kj}$ for $j < k$.
Proof. By (3.17), 


n=l 


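$x, y \in \mathbb{N}_0$. Then, it follows from (4.1) that for $s \in \mathbb{R}$,

\[
\mathcal{P}_n(s) = \mathcal{Q}_n(s) = E \exp(isX)\, M_n(X)
= \sum_{x=0}^\infty \exp(isx)\, M_n(x)\, \frac{(\beta)_x}{x!}\, c^x (1 - c)^\beta.
\]

Substituting the definition of the Meixner polynomial as given in (3.16), we obtain

\[
\mathcal{P}_n(s) = (1 - c)^\beta \Big(\frac{(\beta)_n\, c^n}{n!}\Big)^{1/2} \sum_{x=0}^\infty \exp(isx)\, \frac{(\beta)_x\, c^x}{x!} \sum_{k=0}^n \frac{(-n)_k\, (-x)_k}{(\beta)_k\, k!}\, (1 - c^{-1})^k.
\]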
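Letting $u = c e^{is}$ and interchanging the order of summation, we obtain

\[
\mathcal{P}_n(s) = (1 - c)^\beta \Big(\frac{(\beta)_n\, c^n}{n!}\Big)^{1/2} \sum_{k=0}^n \frac{(-n)_k}{(\beta)_k\, k!}\, (1 - c^{-1})^k \sum_{x=0}^\infty \frac{(-x)_k\, (\beta)_x}{x!}\, u^x.
\tag{5.26}
\]

Noting that

\[
(-x)_k\, u^x = (-1)^k u^k \Big(\frac{d}{du}\Big)^k u^x,
\]

we obtain

\[
\sum_{x=0}^\infty \frac{(-x)_k\, (\beta)_x}{x!}\, u^x
= (-1)^k u^k \Big(\frac{d}{du}\Big)^k \sum_{x=0}^\infty \frac{(\beta)_x}{x!}\, u^x
= (-1)^k u^k \Big(\frac{d}{du}\Big)^k (1 - u)^{-\beta}
= (-1)^k (\beta)_k\, u^k (1 - u)^{-\beta-k}.
\]

Substituting this result into the inner summation in (5.26) and simplifying the resulting expression, we obtain

\[
\mathcal{P}_n(s) = (1 - c)^\beta \Big(\frac{(\beta)_n\, c^n}{n!}\Big)^{1/2} (1 - c e^{is})^{-\beta} \sum_{k=0}^n \frac{(-n)_k}{k!} \Big(\frac{(1 - c) e^{is}}{1 - c e^{is}}\Big)^k
\]
\[
= (1 - c)^\beta \Big(\frac{(\beta)_n\, c^n}{n!}\Big)^{1/2} (1 - c e^{is})^{-\beta} \Big(1 - \frac{(1 - c) e^{is}}{1 - c e^{is}}\Big)^n
= (1 - c)^\beta \Big(\frac{(\beta)_n\, c^n}{n!}\Big)^{1/2} (1 - c e^{is})^{-\beta-n} (1 - e^{is})^n.
\]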
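The closed form for $\mathcal{P}_n$ in the negative binomial case can be compared with a truncated version of the defining sum. The sketch below assumes that (3.16) defines the orthonormal Meixner polynomial as $((\beta)_n c^n/n!)^{1/2}\, {}_2F_1(-n, -x; \beta; 1 - c^{-1})$ and that the marginal probabilities are $(\beta)_x c^x (1-c)^\beta / x!$; the truncation point and the function names are arbitrary illustrative choices.

```python
import cmath
import math


def poch(a, n):
    """Rising factorial (a)_n."""
    out = 1.0
    for r in range(n):
        out *= a + r
    return out


def meixner_orthonormal(n, x, beta, c):
    """Orthonormal Meixner polynomial under the assumed normalisation."""
    total = sum(poch(-n, k) * poch(-x, k) / (poch(beta, k) * math.factorial(k))
                * (1 - 1 / c) ** k for k in range(n + 1))
    return math.sqrt(poch(beta, n) * c ** n / math.factorial(n)) * total


def p_n_sum(n, beta, c, s, x_max=400):
    """Truncated sum over x of exp(isx) M_n(x) (beta)_x c^x (1-c)^beta / x!."""
    total = 0j
    weight = (1 - c) ** beta  # probability of x = 0
    for x in range(x_max + 1):
        total += cmath.exp(1j * s * x) * meixner_orthonormal(n, x, beta, c) * weight
        weight *= (beta + x) * c / (x + 1)  # step to the probability of x + 1
    return total


def p_n_closed(n, beta, c, s):
    """Closed form (1-c)^beta ((beta)_n c^n/n!)^(1/2) (1-ce^{is})^(-beta-n) (1-e^{is})^n."""
    e = cmath.exp(1j * s)
    return ((1 - c) ** beta * math.sqrt(poch(beta, n) * c ** n / math.factorial(n))
            * (1 - c * e) ** (-(beta + n)) * (1 - e) ** n)
```

The truncated sum converges geometrically in `x_max` because the weights decay like $c^x$ with $c < 1$.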


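Therefore, for $j, k \ge 1$,

\[
\int_{\mathbb{R}} \mathcal{P}_j(s)\, \mathcal{P}_k(-s)\, \frac{ds}{s^2}
= (1 - c)^{2\beta} \Big(\frac{(\beta)_j\, (\beta)_k\, c^{j+k}}{j!\, k!}\Big)^{1/2}
\times \int_{\mathbb{R}} (1 - c e^{is})^{-\beta-j} (1 - c e^{-is})^{-\beta-k} (1 - e^{is})^j (1 - e^{-is})^k\, \frac{ds}{s^2}.
\tag{5.27}
\]

Changing the variable of integration from $s$ to $-s$ shows that (5.27) is symmetric in $j$ and $k$; so we assume, without loss of generality, that $j \ge k$.

Next, we write the integrand in (5.27) in the form

\[
(1 - c e^{is})^{-\beta-j} (1 - c e^{-is})^{-\beta-k} (1 - e^{is})^j (1 - e^{-is})^k
\]
\[
= \big[(1 - c e^{is})(1 - c e^{-is})\big]^{-\beta-j} \big[(1 - e^{is})(1 - e^{-is})\big]^k (1 - c e^{-is})^{j-k} (1 - e^{is})^{j-k}
\]
\[
= (1 + c^2 - 2c \cos s)^{-\beta-j} \big[2(1 - \cos s)\big]^k (1 - c e^{-is})^{j-k} (1 - e^{is})^{j-k}
\]
\[
= 2^k (1 + c^2)^{-\beta-j} \Big(1 - \frac{2c}{1 + c^2} \cos s\Big)^{-\beta-j} (1 - \cos s)^k (1 - c e^{-is})^{j-k} (1 - e^{is})^{j-k}.
\]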
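By the binomial theorem,

\[
(1 - c e^{-is})^{j-k} (1 - e^{is})^{j-k}
= \sum_{\ell_1, \ell_2 = 0}^{j-k} (-1)^{\ell_1 + \ell_2} \binom{j-k}{\ell_1} \binom{j-k}{\ell_2} c^{\ell_1} e^{-i(\ell_1 - \ell_2)s},
\]

so it follows that the integral in (5.27) is a linear combination of integrals of the form

\[
\int_{\mathbb{R}} \Big(1 - \frac{2c}{1 + c^2} \cos s\Big)^{-\beta-j} (1 - \cos s)^k e^{-i(\ell_1 - \ell_2)s}\, \frac{ds}{s^2}
\]

which, after symmetrizing with $s$ replaced by $-s$, equals

\[
\int_{\mathbb{R}} \Big(1 - \frac{2c}{1 + c^2} \cos s\Big)^{-\beta-j} (1 - \cos s)^k \cos\big((\ell_1 - \ell_2)s\big)\, \frac{ds}{s^2}.
\]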


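To calculate this integral, we write

\[
\Big(1 - \frac{2c}{1 + c^2} \cos s\Big)^{-\beta-j} = \sum_{\ell=0}^\infty \frac{(\beta + j)_\ell}{\ell!} \Big(\frac{2c}{1 + c^2} \cos s\Big)^\ell
\]

and apply the Chebyshev polynomials $T_n$, given by

\[
\cos\big((\ell_1 - \ell_2)s\big) = T_{|\ell_1 - \ell_2|}(\cos s)
= (|\ell_1 - \ell_2|)! \sum_{m=0}^{|\ell_1 - \ell_2|} \frac{(-2)^m\, (|\ell_1 - \ell_2|)_m}{(|\ell_1 - \ell_2| - m)!\, (2m)!}\, (1 - \cos s)^m;
\]

see [7, p. 1056, 8.942(1)]. Then, we see that we need to calculate

\[
\int_{\mathbb{R}} (\cos s)^\ell (1 - \cos s)^{k+m}\, \frac{ds}{s^2}.
\]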
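The Chebyshev expansion in powers of $1 - \cos s$ is simple to confirm numerically. The following sketch evaluates the finite sum for $T_n(\cos s)$, written here with $n$ in place of $|\ell_1 - \ell_2|$, and can be compared with `math.cos(n * s)`; the function names are illustrative.

```python
import math


def poch(a, n):
    """Rising factorial (a)_n."""
    out = 1.0
    for r in range(n):
        out *= a + r
    return out


def cos_multiple_angle(n, s):
    """cos(n s) via n! * sum_m (-2)^m (n)_m / ((n-m)! (2m)!) * (1 - cos s)^m."""
    y = 1 - math.cos(s)
    return math.factorial(n) * sum(
        (-2) ** m * poch(n, m) / (math.factorial(n - m) * math.factorial(2 * m)) * y ** m
        for m in range(n + 1))
```

For $n = 2$ the sum reduces to $1 - 4(1 - \cos s) + 2(1 - \cos s)^2 = 2\cos^2 s - 1$, as expected.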


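Using the standard half-angle transformations for the cosine function and applying (5.24), we obtain

\[
\int_{\mathbb{R}} (\cos s)^\ell (1 - \cos s)^{k+m}\, \frac{ds}{s^2}
= \int_{\mathbb{R}} \big(1 - 2 \sin^2 \tfrac12 s\big)^\ell \big(2 \sin^2 \tfrac12 s\big)^{k+m}\, \frac{ds}{s^2}
\]
\[
= 2^{k+m} \sum_{r=0}^{\ell} \binom{\ell}{r} (-2)^r \int_{\mathbb{R}} \big(\sin^2 \tfrac12 s\big)^{r+k+m}\, \frac{ds}{s^2}
= 2^{k+m-1}\, \pi \sum_{r=0}^{\ell} \binom{\ell}{r} (-2)^r\, \frac{(2r + 2k + 2m - 3)!!}{(2r + 2k + 2m - 2)!!}.
\tag{5.28}
\]

Expressing these double factorials in terms of rising factorials, we find that (5.28) equals

\[
2^{k+m-1}\, \pi\, \frac{(\tfrac12)_{k+m-1}}{(k + m - 1)!}\; {}_2F_1\big(-\ell,\, k + m - \tfrac12;\, k + m;\, 2\big).
\]

Collecting together all terms, we obtain (5.25).

Finally, a proof of the absolute convergence of (5.25) can be obtained using arguments similar to those used to establish convergence in the previous subsections. □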
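The passage from double factorials to the terminating ${}_2F_1$ in (5.28) can be verified in exact rational arithmetic. In the sketch below, $N$ stands for $k + m$ and the common factor $2^{k+m-1}\pi$ is omitted from both sides; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def double_factorial(n):
    """n!! with the conventions (-1)!! = 0!! = 1."""
    out = 1
    while n > 1:
        out *= n
        n -= 2
    return out


def poch(a, n):
    """Rising factorial (a)_n in exact arithmetic."""
    out = Fraction(1)
    for r in range(n):
        out *= a + r
    return out


def double_factorial_sum(l, N):
    """sum_r C(l, r) (-2)^r (2r+2N-3)!! / (2r+2N-2)!!."""
    return sum(comb(l, r) * (-2) ** r
               * Fraction(double_factorial(2 * (r + N) - 3), double_factorial(2 * (r + N) - 2))
               for r in range(l + 1))


def hypergeometric_form(l, N):
    """(1/2)_{N-1} / (N-1)! * 2F1(-l, N - 1/2; N; 2), as a finite sum."""
    half = Fraction(1, 2)
    f21 = sum(poch(-l, r) * poch(N - half, r) / (poch(N, r) * factorial(r)) * 2 ** r
              for r in range(l + 1))
    return poch(half, N - 1) / factorial(N - 1) * f21
```

Because both sides are rational numbers, the comparison is exact and does not depend on floating-point tolerances.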
6 Summary and conclusions

We have derived a general series expansion for the distance covariance and distance correlation coefficients for the class of Lancaster distributions. This result resolves the fundamental obstacle arising in calculating the singular integrals used to define distance correlation. We have established the utility of the result by applying it to derive the distance correlation for the bivariate normal distribution and its generalizations of Lancaster type, the multivariate normal distributions, and the bivariate gamma, Poisson, and negative binomial distributions which are of Lancaster type.

In computing any of the series obtained in this paper, we can derive upper bounds on the maximum discrepancy arising from the use of a finite number of terms of the series by applying well-known methods of Kotz, et al. [12] together with classical bounds for the various hypergeometric series appearing in the expansions.